New Process



The present invention relates to new processes for improving the manufacture of
clavams e.g. clavulanic acid. The present invention also provides novel DNA sequences and
new microorganisms capable of producing increased amounts of clavulanic acid.

Microorganisms, in particular Streptomyces sp., produce a number of antibiotics
including clavulanic acid and other clavams, cephalosporins, polyketides, cephamycins,
tunicamycin, holomycin and penicillins. There is considerable interest in being able to
manipulate the absolute and relative amounts of these antibiotics produced by the
microorganism and accordingly there have been a large number of studies investigating the
metabolic and genetic mechanisms of the biosynthetic pathways (Demain, A.L. (1990)
"Biosynthesis and regulation of β-lactam antibiotics." in "50 years of Penicillin applications,
history and trends").

Streptomyces clavuligerus produces two major groups of antibiotics: one being the
cephamycins, cephalosporins and penicillins (Demain, A.L. (1990) supra) and the other
comprising clavams. Clavams can be arbitrarily divided into two groups, 5S and 5R clavams,
dependent on their ring stereochemistry. The commercially important clavam clavulanic
acid, a component of the antibiotic Augmentin (trade mark of GlaxoSmithKline), is a 5R
clavam. Examples of 5S clavams are clavam-2-carboxylate (C-2-C), 2-hydroxymethyl
clavam (2HMC) and alanylclavam (Brown et al. (1979) J. Chem. Soc. Chem. Commun. pp282-283).

Genes encoding biosynthetic enzymes and regulatory proteins for clavulanic acid
production have been located in a cluster next to the genes involved in cephamycin C
production and make up a supercluster of antibiotic related genes within the S. clavuligerus
genome (Alexander et al (1998) J. Bacteriol. 180:4068-79). For example the genes encoding
the enzymes involved in clavaminic acid production, a clavulanic acid precursor, which
include orf2 (ceaS) (Khaleeli et al. (1999) J. Am. Chem. Soc. 121:9223-9224), orf3 (bls)
(Bachmann and Townsend (1998) Chem. Commun.:2325-2326), orf4 (pah) (Wu et al. (1995)
J. Bacteriol. 177:3714-3720), orf5 (cas2) (Marsh et al. (1992) Biochemistry 31:12648-57)
and perhaps orf6 (Kershaw et al. (2002) Eur. J. Biochem. 269:2052-2059), are all located
within the clavulanic acid cluster. Disruptions in orfs 2-6 cause a complete loss of clavulanic
acid production when mutant cultures are grown on starch asparagine medium (Aidoo, K. A.
et al. (1993) p219-236 In: V.P. Gullo, J.C. Hunter-Cevera, R. Cooper and R. K. Johnson (ed.),
Developments in Industrial Microbiology series, vol. 33, Society for Industrial Microbiology,
Fredericksburg, Va.). However, this loss is conditional upon the growth medium used, for when
mutants are grown on Soy medium (Salowe et al. (1990) Biochemistry 29: 6499-6508)
clavulanic acid production is partially restored (Jensen et al. (2000) Antimicrob. Agents and
Chemother. 44: 720-726). This phenomenon could suggest that other genes present in the S.
clavuligerus genome could compensate in some way for the loss of the activity of these genes
under certain conditions. Alternatively it could be that the Soy medium contains very small
amounts of one or more of the metabolites produced by the orfs 2-6, allowing strains disrupted in
these genes to make small amounts of clavulanic acid.
Marsh et al (1992) supra have reported that S. clavuligerus contains two copies of
the cas gene (cas1 and cas2). cas1 is not associated with the clavulanic acid gene cluster and
has a high homology to cas2. Disruption of cas2 decreases clavulanic acid production by 35%
when cultures are grown on Soy medium and eliminates production entirely when cultures are
grown on starch asparagine (SA) medium (Paradkar and Jensen 1995 J. Bact. 177: 1307-1314).
The disruption of the cas1 gene results in mutants which produce near wild type levels
of clavulanic acid on SA medium, but produce 31-73% less clavulanic acid when grown on
Soy medium than the wild type (Mosher et al (1999) Antimicrob. Agents and Chemother. 43:
1215-1224). It is also reported that in mutant strains where both the cas1 and cas2 genes have
been disrupted no clavulanic acid is produced under any of the fermentation conditions tested.
Interestingly, when the genes surrounding cas1 were sequenced, no additional genes involved
in clavulanic acid production were found but instead six novel genes involved in 5S clavam
biosynthesis (named cvm1 to cvm6) were identified (Mosher et al (1999) supra). Further work
on these 5S clavam-specific genes showed that disruption of the genes, using genetic
engineering methodologies, leads to improvements in the levels of clavulanic acid made by
the mutant strains and also dramatic reductions in the levels of 5S clavam production
(WO98/33896). This reduction in 5S clavam production, in particular the 5S clavam
clavam-2-carboxylate, is especially important in the commercial production of clavulanic acid
because some 5S clavams are known to be toxic and for this reason the levels are tightly
controlled within the British and US Pharmacopoeias.
Despite these advances in the understanding of clavulanic acid biosynthesis it is still a
highly desirable goal in the pharmaceutical industry to continue to improve production
methods for clavulanic acid, both for reasons of cost and for reasons of safety.

The following definitions are provided to facilitate understanding of certain terms
used frequently herein:

"Gene" as used herein also includes any regulatory region required for gene function
or expression.

"cvm" genes as used herein refers to any of the genes cvm1, cvm2, cvm3, cvm4, cvm5,
cvm6 or cvm7 as defined hereinabove.

"cvmpara" genes as used herein refers to any of the genes cvm6para or cvm7para as
defined hereinabove.

"orf" genes as used herein refers to any of the genes orf2, orf3, orf4, orf5, orf6, orf7, orf8,
orf9, orf10, orf11, orf12, orf13, orf14, orf15, orf16, orf17, or orf18 as defined hereinabove.

"orfpara" genes as used herein refers to any of the genes orf2para, orf3para, orf4para or
orf6para as defined hereinabove.

"Disrupted" as used herein means that the activity of the gene (with regard to 5S clavam
production) has been reduced or eliminated by, for example, insertional inactivation using an
antibiotic resistance gene, preferably apramycin (Paradkar, A.S and Jensen, S.E (1995) supra), or
other mutagenesis technique (for example those disclosed in Sambrook et al (1989) supra). Other
mutagenesis techniques include insertion of other DNAs (not antibiotic resistance genes) and
site-directed mutagenesis to either change one or more bases in the gene sequence or insert one or
more bases into the sequence of the gene.

"Deleted" as used herein means that the gene, or a segment thereof, has been deleted
(removed) from a larger polynucleotide which, before the deletion was performed, included said
gene or segment thereof. When the polynucleotide bearing the deletion is introduced into the
genome of the microorganism by means of gene replacement technology (Paradkar and Jensen
(1995) supra) the activity of the gene or protein encoded thereby is eliminated or reduced such
that the levels of 5S clavam produced by the microorganism are reduced. The deletion may be
large (for example the complete open reading frame with or without regulatory control regions) or
small (for example a single base pair resulting in a frameshift mutation).
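
The small-deletion case in the definition of "Deleted" can be illustrated with a minimal Python sketch, given purely by way of example. It shows how removing a single base shifts the reading frame and, in this toy case, brings a stop codon into frame early. The sequence WILD_TYPE and the function names translate and delete_base are hypothetical and are not taken from any sequence of the invention; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna, index):
    """Return the sequence with the single base at `index` removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical toy open reading frame (not a sequence of the invention).
WILD_TYPE = "ATGGCCGAGCTGACCAAGTAA"

print(translate(WILD_TYPE))                  # MAELTK
print(translate(delete_base(WILD_TYPE, 4)))  # MAS (frame shifted, early stop)
```

In this toy case every codon downstream of the deleted base is read in a different frame, which is why even a one-base deletion can eliminate the activity of the encoded protein.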

"Reduced" as used herein means that the levels of 5S clavam produced by the 
microorganism of the invention are lower than the levels produced in the corresponding S. 
clavuligerus strain which has not had the relevant open reading frames disrupted or deleted. The
corresponding S. clavuligerus is therefore the "parent" strain into which the disrupted or deleted
open reading frames were subsequently introduced to generate the microorganism of the
invention.

"At least maintained" as used herein means that the level of clavulanic acid produced 
in the microorganism of the invention is the same or greater than that produced in the
corresponding S. clavuligerus strain which has not had the relevant open reading frames
disrupted or deleted. The corresponding S. clavuligerus is therefore the "parent" strain into
which the disrupted or deleted open reading frames were subsequently introduced to generate
the microorganism of the invention.


The present invention concerns new processes for making clavulanic acid using
newly identified S. clavuligerus genes. Using a probe derived from orf4, a fragment of the S.
clavuligerus genome has been isolated and has been shown to comprise a number of genes
that when disrupted are shown to affect 5S and 5R clavam biosynthesis in S. clavuligerus.
Sequence analysis of the fragment has indicated the presence of a gene showing high
similarity to orf4 (hereinafter called orf4par). However, surprisingly, further sequence
analysis of the regions flanking the orf4par gene has revealed a new cluster of genes
comprising paralogues of genes previously identified in both the clavulanic acid (cas2 cluster)
and 5S clavam (cas1 cluster) gene clusters.

Accordingly the invention provides a S. clavuligerus microorganism comprising DNA
corresponding to one or more open reading frames essential for 5S clavam biosynthesis, wherein
said open reading frames are disrupted or deleted such that the production of 5S clavams by said
S. clavuligerus is reduced and clavulanic acid production is at least maintained, wherein the open
reading frames are selected from:

a) cvm6para (SEQ ID NO:1);

b) cvm7para (SEQ ID NO:2);

c) cvm6para and cvm6 (SEQ ID NO:5); or

d) cvm7para and cvm7 (SEQ ID NO:6).

In a second aspect the invention provides a S. clavuligerus microorganism comprising
DNA corresponding to one or more open reading frames essential for 5S clavam biosynthesis,
wherein said open reading frames are disrupted or deleted such that the production of 5S clavams
by said S. clavuligerus is reduced and clavulanic acid production is at least maintained, wherein
the open reading frames are selected from:

a) cvm6para and one or more of cvm1 (SEQ ID NO:7), cvm2 (SEQ ID NO:8), cvm3 (SEQ ID
NO:9), cvm4 (SEQ ID NO:10), cvm5 (SEQ ID NO:11), cvm6, cvm7 or cvm7para; or

b) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para.

The genes cvm1, cvm2, cvm3, cvm4, cvm5 and cvm6 are disclosed in Mosher et al (1999)
supra and WO98/33896 (cvm1 is orfup1, cvm2 is orfup2, cvm3 is orfup3, cvm4 is orfdwn1, cvm5 is
orfdwn2 and cvm6 is orfdwn3). The cvm7 gene, found to be a further 5S clavam specific gene of
the 5S clavam (cas1) cluster, has been identified during work leading to the present invention and
is disclosed hereinbelow.

In a further aspect the invention provides isolated polynucleotides comprising the
cvm6para and cvm7para open reading frames which are used in the preparation of the S.
clavuligerus microorganism of the invention. Preferably said polynucleotides comprise open
reading frames selected from the group consisting of:

a) cvm6para;

b) cvm7para;

c) cvm6para and cvm6;

d) cvm7para and cvm7;

e) cvm6para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm7para; or

f) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para.

In another aspect the present invention provides vectors for cloning and manipulating the
cvm polynucleotides disclosed herein and which can be used in the preparation of the S.
clavuligerus microorganism of the invention. Processes for using these vectors to make the S.
clavuligerus microorganism of the invention are also provided.

The encoded polypeptides from cvm6para and cvm7para are also provided by the
invention (SEQ ID NO:3 and SEQ ID NO:4 respectively).
The invention further provides a polynucleotide comprising one or more open reading
frames encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open
reading frames are selected from the group consisting of:

a) orf2para (SEQ ID NO:12),

b) orf3para (SEQ ID NO:13),

c) orf4para (SEQ ID NO:14), and

d) orf6para (SEQ ID NO:15).

In a further aspect the invention provides a polynucleotide comprising one or more open
reading frames encoding one or more enzymes involved in clavulanic acid biosynthesis wherein
said open reading frames comprise one or more of:

a) orf2para,

b) orf3para,

c) orf4para,

d) orf6para

in combination with one or more genes involved in clavulanic acid biosynthesis selected from
orf2, orf3, orf4, orf5, orf6, orf7, orf8, orf9, orf10 (Canadian patent application CA2108113 and
Jensen, S.E et al (2000) Antimicrob. Agents Chemother 44:720-6), orf11, orf12 (Li, R.N et al
(2000) J. Bacteriol 182:4087-95), orf13, orf14, orf15, orf16, orf17, or orf18 (patent application
PCT/GB02/04989).

Vectors comprising such polynucleotides are also provided by the present invention
together with processes for the use of such vectors to prepare strains of Streptomyces clavuligerus
which can be used to produce elevated levels of clavulanic acid.

Strains of Streptomyces clavuligerus so produced and methods for using them to produce
clavulanic acid by fermentation are also provided.

Thus the invention further provides a Streptomyces clavuligerus microorganism
comprising a vector comprising a polynucleotide comprising one or more open reading frames
encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open reading
frames are selected from the group consisting of:

a) orf2para,

b) orf3para,

c) orf4para, and

d) orf6para.

In a further aspect the invention provides a Streptomyces clavuligerus microorganism
comprising a vector comprising a polynucleotide comprising one or more open reading frames
encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open reading
frames are selected from the group consisting of:

a) orf2para,

b) orf3para,

c) orf4para,

d) orf6para

in combination with one or more genes involved in clavulanic acid biosynthesis selected from
orf2, orf3, orf4, orf5, orf6, orf7, orf8, orf9, orf10 (Canadian patent application CA2108113 and
Jensen, S.E et al (2000) Antimicrob. Agents Chemother 44:720-6), orf11, orf12 (Li, R.N et al
(2000) J. Bacteriol 182:4087-95), orf13, orf14, orf15, orf16, orf17, or orf18 (patent application
PCT/GB02/04989).

The present invention also contemplates a S. clavuligerus microorganism comprising a
combination of one or more disrupted or deleted cvm6para or cvm7para genes, optionally in
combination with other disrupted or deleted 5S genes previously disclosed, together with vectors
comprising orf2para, orf3para, orf4para or orf6para genes, optionally in combination with other
clavulanic acid biosynthetic genes (selected from the genes orf2 to orf18) previously disclosed.

Polynucleotides of the invention can be isolated by conventional cloning methods, such as
PCR or library screening methods, using the sequences disclosed herein and in Mosher et al
(1999) supra, WO98/33896, Canadian patent application CA2108113, Jensen, S.E et al (2000)
supra, Li, R.N et al (2000) supra and patent application PCT/GB02/04989, as indicated
hereinabove. Examples of such cloning methods are described in, for example, Sambrook, J et al
(1989) Molecular cloning, a laboratory manual (2nd Ed) Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York.

Polynucleotides comprising individual open reading frames can be isolated and ligated
together into vectors in a variety of combinations as defined hereinabove using techniques well
known in the art. The choice of vector will depend on the function being carried out, for example
cloning, expression, gene inactivation or transfer into S. clavuligerus e.g. for gene replacement. In
all cases a variety of vectors are available to the skilled person and are well known in the art. For
example such vectors are known from Sambrook, J et al (1989) supra for general cloning vectors,
Hopwood, D.A et al (1985) supra for Streptomyces vectors, Paradkar and Jensen (1995) supra,
Mosher et al (1999) supra and WO98/33896 supra for gene disruption and gene replacement
vectors and CA2108113 supra for vectors suitable for expression of genes in Streptomyces
clavuligerus. However the choice of vector is not limited to just those disclosed in these sources.

Further, in the case of the gene combinations involving the orf2para, orf3para, orf4para,
orf5para and orf6para genes the skilled artisan would be able to design suitable DNA constructs
to ensure that each open reading frame is suitably positioned relative to a transcriptional promoter,
whether this be the native promoter or a heterologous promoter that also functions in the
Streptomyces clavuligerus background, or indeed other regulatory sequence, in such a manner that
expression of each open reading frame is optimally achieved.
Subsequent manipulation of the polynucleotides, in particular with respect to their
introduction into the Streptomyces clavuligerus background, can be carried out according to
standard methods as disclosed in, for example, Hopwood, D.A et al (1985) supra. Disruption of
gene sequences, and subsequent gene replacement, can be carried out according to the method of
Paradkar, A.S and Jensen, S.E (1995) supra. Deletion of gene sequences can be carried out using
well established techniques, for example that disclosed in WO98/33896.

Microorganisms of the invention can be prepared from Streptomyces clavuligerus strains
including, but not limited to, Streptomyces clavuligerus ATCC 27064 (American Type Culture
Collection, Manassas, Virginia, USA), alternatively available as NRRL 3585 (Northern Regional
Research Laboratory, Peoria, Illinois, USA). Mutant strains of Streptomyces
clavuligerus can also be used, including those prepared by genetic engineering techniques, or those
prepared by strain improvement methods. Examples of such strains include Streptomyces
clavuligerus strains 56-1A, 56-3A, 57-2B, 57-1C, 60-1A, 60-2A, 60-3A, 61-1A, 61-2A, 61-3A or
61-4A as disclosed in WO98/33896.

Thus in another aspect the invention relates to a process for improving clavulanic acid
production in a suitable microorganism comprising isolating a polynucleotide as described
hereinabove, manipulating said polynucleotide, introducing the manipulated polynucleotide into
said suitable microorganism and fermenting said suitable microorganism under conditions
whereby clavulanic acid is produced. Manipulation of said polynucleotide may be by means of
disrupting or deleting gene sequences in the case of cvmpara genes, optionally together with cvm
genes, or by inserting into vectors suitable for expression in the case of orfpara genes, optionally
together with orf genes.

Preferably the suitable microorganism is Streptomyces clavuligerus.
Such fermentation methods are well known in the art, for example the methods disclosed
in UK Patent Specification No. 1,508,977. Methods for using clavulanic acid in the preparation of
antibiotic formulations are similarly well known in the art.

Examples 

Example 1 - Materials and Methods 

In the examples all methods are as described in Sambrook, J. et al supra, Hopwood, D.A.
et al. (1985) supra and Kieser, T et al. (2000) Practical Streptomyces Genetics, unless
otherwise stated. Transformation methods can also be found in Paradkar, A.S. and Jensen, S.E
(1995) supra.

1.1 Bacterial strains, media and culture conditions.

Streptomyces clavuligerus NRRL 3585 was obtained from the Northern Regional
Research Laboratory (Peoria, IL). S. clavuligerus was maintained on either MYM agar
(Stuttard, C. (1982) J. Gen. Microbiol. 128:115-121) or ISP Medium #4 agar plates (Difco,
Detroit, MI).

Cultures for the isolation of chromosomal DNA were grown on a 2:3 mixture of
trypticase soy broth and YEME as described by Alexander et al. (1998) J. Bact. 180:4068-79.
Cultures for analysis of the production of clavulanic acid and other clavam metabolites were
grown on Soy medium (European Patent 0 349 121) unless otherwise stated. All liquid
cultures were grown at 26°C on a rotary shaker at 250 rpm.

Manipulation of DNA in Escherichia coli was done using strain XL-1 Blue (Stratagene,
La Jolla, CA). E. coli cultures were maintained on LB agar medium and grown in liquid
culture in LB medium at 37°C (Sambrook, J et al (1989) supra). Plasmid-containing cultures
were supplemented with appropriate levels of antibiotic.



1.2 DNA manipulations. 

Standard DNA manipulations such as plasmid isolation, restriction endonuclease
digestion, generation of blunt-ended fragments, ligation, 32P labelling of DNA probes by nick
translation and E. coli transformation were carried out as described in Sambrook J et al (1989)
supra. Plasmid and genomic DNA isolation from Streptomyces spp. was conducted as
described in Kieser, T et al (2000) supra. Construction of a library of S. clavuligerus genomic
DNA fragments in the cosmid pWE15 was carried out according to the manufacturer's
instructions (Stratagene).

Southern analysis of S. clavuligerus DNA fragments was conducted at high stringency as
described by Sambrook, J et al (1989) supra. Hybridization membranes were washed twice
for 30 min at 2xSSC, 0.1% SDS and once for 30 min at 0.1xSSC, 0.1% SDS, all at SS^C.

Example 2 - Preparation of the paralogue cluster DNA fragment



2.1 Cloning and nucleotide sequencing of the orf4 paralogue 

A strong and a very weak hybridization signal were consistently observed on Southern
blots of NcoI-digested S. clavuligerus chromosomal DNA when probed with the orf4 gene
(CA2108113). The strong signal corresponded to the orf4 gene, but the identity of the gene
that gave rise to the very weak signal was unknown. Therefore it was decided to clone this
gene. To this end, NcoI fragments from S. clavuligerus DNA of approximately 4-5 kb in size
were ligated into NcoI-digested pUC120 (Vieira, J and J Messing (1987) Methods Enzymol.
153, 3-11) and screened using a colony blot hybridisation method and employing the orf4
gene as a probe. Plasmid DNA was isolated from potential positive clones and confirmed to
carry a 4.3 kb NcoI fragment. A representative clone, p04H-4, was chosen for further study.
The sequencing of the 4.3 kb NcoI fragment was carried out. Analysis of the sequence
generated identified three genes, one of which had homology to orf4 and was called orf4par.
The two other genes present were found to have homology with orf6 and cvm6 and were
therefore called orf6par and cvm6par. This result suggested that this region of DNA may
contain a cluster of genes with paralogues in either the clavulanic acid biosynthetic gene
cluster or the cvm clavam biosynthetic gene cluster.
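
The digestion and size-selection step described above can be pictured with a minimal in silico sketch in Python, given purely by way of example. It assumes a linear DNA molecule and the NcoI recognition sequence CCATGG, cut after the first C; the toy sequence TOY and the function names digest and size_select are hypothetical, and the size window used is scaled down from the 4-5 kb window of the experiment.

```python
def digest(seq, site="CCATGG", cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the position of the cut within the site
    (1 corresponds to C^CATGG for NcoI).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length lies in [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy "chromosome" carrying two NcoI sites.
TOY = "A" * 10 + "CCATGG" + "T" * 20 + "CCATGG" + "G" * 5

fragments = digest(TOY)
print([len(f) for f in fragments])                           # [11, 26, 10]
print([len(f) for f in size_select(fragments, 20, 30)])      # [26]
```

Only the size-selected fragments would then be carried forward to ligation and colony hybridisation, which reduces the number of clones that have to be screened with the probe.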

2.2 Sequencing of DNA flanking the 4.3 kb NcoI fragment containing orf4par

Sequence analysis of DNA flanking the 4.3 kb NcoI fragment containing orf4par was
achieved by identifying 2 cosmid clones containing the orf4par gene. The two cosmid clones
containing orf4par, 14E10 and 6G9, were isolated from a S. clavuligerus pWE15 (Promega,
Madison, WI) cosmid bank that had been probed with a 0.46 kb SalI fragment that is internal
to the orf4par gene. These cosmids have been partially mapped using a series of digestions
and Southern hybridization experiments (In: Nucleic acid techniques in bacterial systematics.
Ed. Stackebrandt, E and Goodfellow, M (1991) John Wiley and Sons, p205-248). Digestion
of both cosmids with EcoRI, KpnI and NruI suggests that the insert size of 14E10 is
approximately 45 kb and that of 6G9 is approximately 40 kb. These two cosmid inserts have about
20 kb of overlapping DNA and provided DNA for sequence analysis of regions upstream and
downstream of the 4.3 kb NcoI fragment containing orf4par.

DNA sequence information was generated essentially as described in CA2108113. The
DYEnamic ET Terminator Cycle Sequencing Kit (Amersham Pharmacia, Baie d'Urfé,
Quebec, Canada) was used. Approximately 13.3 kilobases of contiguous DNA sequence was
generated. The nucleotide sequence of the S. clavuligerus chromosomal DNA generated in
these experiments is shown in SEQ ID No: 16.

A number of open reading frames were identified which displayed significant homology
with the previously described orf2, orf3, orf4 and orf6 (CA2108113). These genes have
been located within the genome in relation to each other, and are found to be nearly in the
same organisation as that of the genes within the clavulanic acid cluster. The genes orf2par,
orf3par and orf4par are adjacent to each other and in the same orientation as their
counterparts orf2, orf3 and orf4. However, cas1 is not downstream of orf4par as cas2 is to
orf4 in the clavulanic acid pathway but is instead within the clavam cluster (Mosher et al
(1999) supra). Another difference between the clavulanic acid cluster and the paralogue
arrangement is that orf6par is end-on-end to orf4par, and so is not in the same orientation as
orf2par-4par, whereas orf6 is in the same orientation as orfs 2-4 in the clavulanic acid cluster.
Surprisingly the gene immediately upstream of orf6par was found to be a gene that had a
paralogue in the clavam and not the clavulanic acid cluster. This gene was called cvm6par, as
it is a paralogue of the cvm6 gene found clustered with cas1 (Mosher et al (1999) supra). The
cvm6 gene encodes an enzyme that is involved in clavam production (orfdwn3 in
WO98/33896).

Located adjacent to cvm6par is a new gene called cvm7par. This gene shows homology
to cvm7, a gene that is located upstream of cvm3 in the clavam cluster (further described
hereinbelow). Upstream of cvm7par is a new open reading frame, believed to encode a sensor
kinase. It encodes a polypeptide of 555 amino acids and shows good similarity to sensor
kinase domains of two component response regulator genes.

2.3 Functional analysis of the open reading frames 

Computer analysis of the DNA sequence shown in SEQ ID No. 16 predicts the presence
of 7 open reading frames. A description of each gene is shown in Table 1.
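
The kind of computer analysis referred to above can be pictured with a minimal open reading frame scan in Python, given purely by way of example. It is not the analysis actually used to predict the 7 open reading frames: it only considers ATG start codons, whereas bacterial genes may also use alternative starts such as GTG, and it ignores codon usage. The toy sequence and the names reverse_complement and find_orfs are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six frames for ATG...stop open reading frames.

    Returns tuples (strand, start, end, orf_sequence); coordinates are
    0-based and refer to the strand that was scanned.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy sequence containing one short ORF on the forward strand.
print(find_orfs("CCATGAAACCCTAGGG"))
# [('+', 2, 14, 'ATGAAACCCTAG')]
```

Each predicted open reading frame would then be translated and compared against known proteins, which is the step summarised in Table 1.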

Table 1

Orf designation    Homology (BLAST P)

orf2par            acetolactate synthase
                   (67% identity to orf2, carboxyethyl arginine synthase, CEAS)

orf3par            asparagine synthetase
                   (49% identity with orf3, β-lactam synthase, BLS)

orf4par            amidinohydrolase
                   (71% identity with orf4, amidinohydrolase, PAH)

orf6par            ornithine acetyltransferase
                   (47% identity with orf6, ornithine acetyltransferase, OAT)

cvm6par            aminotransferase
                   (66% identity with cvm6, acetylornithine aminotransferase)

cvm7par            transcriptional regulator
                   (33% identity with cvm7 homologue)

Sensor kinase      sensor kinase
                   (47% identity with two-component system from S. coelicolor A3(2))
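
The percentage identities in Table 1 are alignment statistics. A minimal Python sketch of how such a figure is computed from a pairwise alignment is given below, purely by way of example. It assumes the convention of counting identical positions over the full alignment length, gap columns included; the two aligned toy sequences and the function name percent_identity are hypothetical and do not correspond to any polypeptide of the invention.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences carry
    the same residue. Gap columns ('-') count towards the alignment
    length but never as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned toy peptides: 5 identical columns out of 8.
print(percent_identity("MKT-LLVA", "MRTALL-A"))  # 62.5
```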



To assess the possible roles of these ORFs in the biosynthesis of clavulanic acid and/or
clavams produced by S. clavuligerus, insertional inactivation mutants were created by gene
replacement essentially as described by Paradkar and Jensen (1995) supra.

However, in order to definitively define the phenotype of these disruptions, it was considered
important to disrupt orf3par, orf4par, orf6par and cvm6par not only in wild type S.
clavuligerus, but also in strains of S. clavuligerus that were already defective in the
expression of orf3, orf4, orf6 and cvm6 respectively. The orf3, 4 and 6 mutants were made as
described in United States Patent No. 6,332,106 and the cvm6 mutant made as described in
WO98/33896.



Example 3 - Analysis of orf4 and orf4par

3.1 Construction of orf4 mutants

Mutants disrupted in orf4 (pah) were made as described in United States Patent No.
6,332,106.



3.2 Construction of orf4par mutants

p04H-4 (4.3 kb NcoI fragment cloned into the NcoI site of pUC120 (Vieira and Messing
1987 supra)) was digested with KpnI (one site in the cloned fragment and one site in the
vector) and religated to reduce the size of the orf4par-bearing DNA insert to 1.7 kb, thereby
generating the plasmid p4K-1. The orf4par gene within p4K-1 was disrupted by digestion at
its centrally located EcoNI site and insertion of the apramycin (apr) resistance gene cassette
from pUC120apr (Trepanier et al. (2002) Microbiology 148: 643-656) after both fragments
had been made blunt by treatment with the Klenow fragment of DNA polymerase I. The
KpnI/NcoI insert carrying the disrupted orf4par gene was then inserted into the EcoRI site of
pDA501 after blunting the ends of both insert and vector. pDA501 is a shuttle vector prepared
by fusing the Streptomyces plasmid pIJ486 (Kieser, T et al (2000) supra) to the E. coli
plasmid pTZ18R (Stratagene) by means of their EcoRI and BamHI sites. The resulting
construct, 6pDAB, was used to transform S. lividans TK24, and finally wild-type S.
clavuligerus to thiostrepton (thio at 5 µg/ml) and apramycin (apr at 20 µg/ml) resistance.

Gene replacement mutants were generated as described by Paradkar and Jensen (1995) 
supra. 
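
The cassette insertion used to disrupt orf4par can be pictured at the sequence level with a minimal Python sketch, given purely by way of example. A resistance cassette is inserted at a chosen position within a gene, leaving two flanking arms of the original gene on either side; it is these arms that allow the disrupted copy to replace the chromosomal copy by homologous recombination. The toy gene, the toy cassette and the names insert_cassette and flanking_arms are hypothetical and much shorter than any real gene or apr cassette.

```python
def insert_cassette(gene, cassette, position):
    """Insert `cassette` into `gene` at `position` (blunt-end style),
    interrupting the coding sequence."""
    if not 0 < position < len(gene):
        raise ValueError("position must lie strictly inside the gene")
    return gene[:position] + cassette + gene[position:]


def flanking_arms(disrupted, cassette):
    """Return the gene segments to the left and right of the cassette."""
    left, _, right = disrupted.partition(cassette)
    return left, right


# Hypothetical toy gene and cassette (placeholders, not real sequences).
GENE = "ATGGCCGAGCTGACCAAGTAA"
CASSETTE = "TTGACATTGACA"

disrupted = insert_cassette(GENE, CASSETTE, len(GENE) // 2)
left_arm, right_arm = flanking_arms(disrupted, CASSETTE)
print(len(GENE), len(disrupted))   # 21 33
print(left_arm + right_arm == GENE)  # True
```

Inserting near the centre of the gene, as with the centrally located site used above, leaves arms of similar length on both sides of the marker.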
3.3 Construction of orf4/orf4par mutants

An approach was undertaken to generate the double mutant by transforming protoplasts
of the orf4par (aprR) mutant with the orf4 (thioR) disruption construct (Aidoo et al. (1994)
Gene 147:41-6). Protoplast preparations from orf4par mutants were transformed with the
orf4 disruption construct isolated from S. lividans. Transformants were selected on
thiostrepton at 5 µg/ml and hygromycin (hyg) at 50 µg/ml. Primary transformants were put
through two rounds of sporulation under non-selective conditions in order to generate gene
replacement mutants as described by Paradkar and Jensen (1995) supra.

3.4 Fermentation analysis of orf4, orf4par and orf4/orf4par mutants

To test the effect of disrupting orf4, orf4par and orf4/4par on clavulanic acid
biosynthesis, spores from each isolate were inoculated into 20 ml of seed medium (European
Patent 0 349 121) and grown for 2 days at 26°C with shaking. 1 ml of the seed culture was
then inoculated into a final stage Soy medium (European Patent 0 349 121) and grown at 26°C
for up to 3 days with shaking. Samples of final stage broth were withdrawn after three days
growth and assayed for clavulanic acid productivity by HPLC (Mosher et al (1999) supra)
and/or using an imidazole derivatised colorimetric assay (Bird, A.E. et al (1982) Analyst,
107: 1241-1245 and Foulston, M. and Reading, C. (1982) Antimicrob. Agents Chemother.,
22:753-762).

Fermentation analysis of orf4 disruptant

The orf4 disruptant was fermented in Soy medium and compared to wild type S.
clavuligerus for production of clavulanic acid. After 72 hrs growth, accumulation of
clavulanic acid was reduced by 71%.

From these results it can be concluded that orf4 is required for efficient production of
clavulanic acid as elimination of this gene by disruption causes a reduction in clavulanic acid
levels.

Fermentation analysis of orf4par disruptant

Mutant 5pDA, defective in the orf4par gene, was fermented in Soy medium and
compared to wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth,
accumulation of clavulanic acid was reduced by 12%.

From these results it can be concluded that, like orf4, orf4par contributes to
clavulanic acid biosynthesis, as elimination of this gene by disruption causes a reduction in
clavulanic acid levels.

Fermentation analysis of orf4/orf4par disruptants

When mutants A4-A1 and 3A3-A3, defective in both copies of the orf4 gene, were
grown in Soy medium, production of clavulanic acid could not be detected.

From these results it can be concluded that, under the conditions tested, both genes,
orf4 and orf4par, contribute to clavulanic acid biosynthesis, as the double disruption results in
a mutant unable to make clavulanic acid.
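
The percentages reported in this section can be restated as residual clavulanic acid titres relative to wild type. The short Python sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and encoding "could not be detected" as a 100% reduction is an assumption, since the true value is simply below the detection limit of the assay.

```python
# Reductions in clavulanic acid accumulation after 72 h, relative to wild type,
# as reported in section 3.4. "Not detected" is encoded as 100 % (an assumption).
reductions_percent = {
    "orf4": 71.0,
    "orf4par": 12.0,
    "orf4/orf4par": 100.0,
}


def residual_percent(reduction_percent: float) -> float:
    """Return the titre remaining, as a percentage of the wild-type titre."""
    if not 0.0 <= reduction_percent <= 100.0:
        raise ValueError("reduction must lie between 0 and 100 %")
    return 100.0 - reduction_percent


residuals = {strain: residual_percent(r) for strain, r in reductions_percent.items()}

for strain, residual in residuals.items():
    print(f"{strain}: {residual:.0f}% of wild-type clavulanic acid")
```

Laid out this way, the pattern described in the text is easy to see: each single disruptant retains some production, while the double disruptant retains none that the assay can measure.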

3.5 Southern Analysis

The orf4, orf4par and orf4/orf4par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.

Example 4 - Analysis of orf6 and orf6par

4.1 Construction of orf6 mutants 

orf6 mutants were made as described in United States Patent No. 6,332,106.

4.2 Construction of orf6par mutants

The orf6par gene was disrupted by introduction of a neomycin resistance gene (neo^r) into
the RsrII site, approximately midway through the coding region. In order to achieve this,
p04H-4 was digested with KpnI to remove orf4par and self-ligated to give p5K-6. p5K-6 was
digested with RsrII and the neomycin resistance gene, released from pFDNeo-S (Denis and
Brzezinski (1992) Gene 111:115-118) as a PstI/EcoRI fragment, was inserted after both
fragments had been made blunt by treatment with the Klenow fragment of DNA polymerase
I. The construct pNeo5K-6A was obtained, which has the neo^r gene in the same orientation as
the orf6par gene.

A shuttle vector called pNeo5K-6Atsi#14 was constructed by inserting pIJ486, as a
6.2 Kb fragment linearised with BglII, into the BamHI polylinker site of pNeo5K-6A. The
shuttle vector was used to transform S. lividans TK24 and finally S. clavuligerus WT to
thiostrepton (5 µg/ml) and neomycin (50 µg/ml) resistance. Primary transformants were
subjected to two rounds of sporulation under non-selective conditions in order to generate
gene replacement mutants as described by Paradkar and Jensen (1995) supra.

4.3 Construction of orf6/orf6par mutants 

orf6/orf6par double mutants were generated by transforming protoplasts of the orf6par
(neo^r) mutant with the orf6 (apr^r) disruption construct (Mosher et al. (1999) supra). Protoplast
preparations from orf6par mutants were transformed with the orf6 disruption construct
isolated from S. lividans. Transformants were selected on apramycin (apr) at 50 µg/ml. Primary
transformants were put through two rounds of sporulation under non-selective conditions in
order to generate gene replacement mutants as described by Paradkar and Jensen (1995)
supra.

4.4 Fermentation of orf6, orf6par and orf6/orf6par mutants

To test the effect of disrupting orf6, orf6par and orf6/orf6par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of orf6 mutants

Mutant 6-1A, defective in the orf6 gene, was fermented in Soy medium and compared
to wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth,
accumulation of clavulanic acid was reduced by 57%. From these results it can be concluded
that orf6 is required for efficient production of clavulanic acid, as elimination of this gene by
disruption causes a reduction in clavulanic acid levels.

Fermentation Analysis of orf6par mutants

Mutant 14-2B(2), defective in the orf6par gene, was fermented in Soy medium and
compared to wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth,
accumulation of clavulanic acid was reduced by 27%. From these results it can be concluded
that, like orf6, orf6par contributes to clavulanic acid biosynthesis, as elimination of this gene
by disruption causes a reduction in clavulanic acid levels.

Fermentation Analysis of orf6/orf6par mutants

Two separate mutants defective in both orf6 and orf6par were fermented in Soy
medium and compared to wild type S. clavuligerus for production of clavulanic acid. After
72 hrs growth, accumulation of clavulanic acid was reduced by an average of 65%.

From these results it can be concluded that both orf6 and orf6par are necessary for
efficient production of clavulanic acid, since disruption of either copy of the gene causes a
reduction in clavulanic acid production. Inactivation of both copies of the gene caused a
further decrease, but not a complete loss, of clavulanic acid producing ability.

4.5 Southern Analysis

The orf6, orf6par and orf6/orf6par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copy of the relevant
gene had been disrupted as expected.

Example 5 - Analysis of cvm6 and cvm6par

5.1 Construction of cvm6 mutants

Construction of mutants disrupted in cvm6 has already been described in
WO98/33896 (cvm6 is orfdwn3).

5.2 Construction of cvm6par mutants

A 1.7 Kb SalI fragment containing cvm6par was released from p04H-4 and ligated
into pUC118 at the SalI site. The resulting plasmid was digested with EcoNI to release a 140
bp fragment internal to cvm6par. In place of this fragment, the neomycin resistance gene from
pFDNeo-S, released as an EcoRI/PstI fragment, was ligated into cvm6par after both
fragments had been made blunt by treatment with the Klenow fragment of DNA polymerase
I. The neo^r marker was inserted in the same orientation as cvm6par. The neomycin-containing
SalI fragment was released with EcoRI and inserted into the shuttle vector pUWL-KS
(Wehmeier, U.F. (1995) Gene 165:149-150) at the EcoRI site. The construct was named
pNeoSal1.7U.

The plasmid pNeoSal1.7U was used to transform S. lividans TK24, and finally
S. clavuligerus wild type. The resulting cvm6par::neo transformants were selected on MYM
medium with 50 µg/ml neomycin and 5 µg/ml thiostrepton and then subjected to two rounds of
sporulation under non-selective conditions to give double cross-over mutants.

5.3 Construction of cvm6/cvm6par mutants 

The construct pNeoSal1.7U isolated from S. lividans TK24 was also used to transform
the cvm6 mutant 56-3A, where the apr^r cassette was inserted into cvm6 in the same
orientation as the gene. Transformants were grown on MYM medium with 50 µg/ml
neomycin and 5 µg/ml thiostrepton. The mutants were put through two rounds of sporulation
under non-selective conditions as described above and double cross-over mutants were
isolated.

5.4 Fermentation of cvm6, cvm6par and cvm6/cvm6par mutants

To test the effect of disrupting cvm6, cvm6par and cvm6/cvm6par on β-lactam
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of cvm6 mutants

It was reported in WO98/33896 that mutants 56-1A, 56-3A, 57-1C and 57-2B,
defective in the cvm6 gene, produced elevated levels of clavulanic acid (125-141% of the
control strain) and greatly reduced levels of clavam-2-carboxylate and
2-hydroxymethylclavam when cultured in Soy medium.

These results suggest that the cvm6 gene is required for efficient production of the 5S
clavams. Disruption of cvm6 not only results in a reduction in clavams but also a
simultaneous increase in clavulanic acid.

Fermentation Analysis of cvm6par mutants

Mutants 3A1, 3A2, 2A-6, 2B-1 and 2B-2, defective in the cvm6par gene, were fermented in
Soy medium and compared to wild type S. clavuligerus for production of β-lactam
metabolites. After 72 hrs growth, accumulation of clavulanic acid was increased by 6-11%.
Production of clavam-2-carboxylate and alanylclavam was abolished and levels of
2-hydroxymethylclavam were reduced by 50-85%.

These results suggest that, like cvm6, the cvm6par gene is required for efficient
production of the 5S clavams. Disruption of cvm6par not only results in a reduction in
clavams but also a simultaneous increase in clavulanic acid.

Fermentation Analysis of cvm6/cvm6par double mutants

Mutants A-1, A-2, B-1, B-2, C-1 and C-2, defective in both the cvm6 and cvm6par
genes, were grown in Soy medium and compared to wild type S. clavuligerus for their
production of β-lactam metabolites. Production of clavulanic acid was increased by 12-27%,
production of alanylclavam and clavam-2-carboxylate was eliminated and levels of
2-hydroxymethylclavam were reduced by 70-83%.

These results indicate that, like the cvm6 and cvm6par single mutants, the cvm6/cvm6par
double mutants produced elevated levels of clavulanic acid, and that both genes are required for
the efficient production of 5S clavams.

5.5 Southern Analysis

The cvm6, cvm6par and cvm6/cvm6par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.

Example 6 - Analysis of orf3 and orf3par

6.1 Construction of orf3 mutants

Mutants disrupted in orf3 were made as described in United States Patent No. 6,332,106.

6.2 Construction of orf3par mutants

The plasmid p5.7EcoRI ref (pJOE-based, hyg) was used as the disruption template for
orf3par. The insert in this plasmid is approximately 5.7 kb and includes part of cvm6par, all of
orf6par, orf4par and orf3par, and part of orf2par, all carried within the plasmid pJOE829
(Kieser, T. et al. (2000); Aidoo et al. (1994) Gene 147:41-6). The disruption vector was
constructed by ligation of a thiostrepton resistance cassette (Aidoo et al. supra) into FseI-digested
p5.7EcoRI. A unique FseI site is located within the insert, 507 bp from the start of orf3par.
The correct construct was obtained and used to sequentially transform S. lividans TK24 and
then S. clavuligerus wild type. Primary transformants were selected on thiostrepton (5 µg/ml)
and hygromycin (25 µg/ml). The mutants were put through two rounds of sporulation under
non-selective conditions as described above and putative double cross-over mutants were
isolated.

6.3 Construction of orf3/orf3par mutants

The orf3par disruption cassette described in section 6.2 was isolated from S. lividans
TK24 and used to transform orf3::apra mutants. Transformants were selected on MYM
medium containing thiostrepton (5 µg/ml) and hygromycin (25 µg/ml). The mutants were put
through two rounds of sporulation without selection and double cross-over mutants isolated as
previously described.

6.4 Fermentation Analysis of orf3, orf3par and orf3/orf3par mutants

To test the effect of disrupting orf3, orf3par and orf3/orf3par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of orf3 mutants

Mutants Ap3-1, Ap3-2 and Ap3-3 were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth, accumulation
of clavulanic acid was reduced by 31-71%.

From these results it can be concluded that orf3 is required for efficient production of
clavulanic acid, as elimination of this gene by disruption causes a reduction in clavulanic acid
levels.

Fermentation of orf3par mutants

Mutants 3A-1 and 3A-2 were fermented in Soy medium and compared to wild type S.
clavuligerus for production of clavulanic acid. After 72 hrs growth, accumulation of
clavulanic acid was reduced by 9%.

From these results it can be concluded that orf3par is required for efficient production
of clavulanic acid, as elimination of this gene by disruption causes a reduction in clavulanic
acid levels.

Fermentation of orf3/orf3par mutants

Clavulanic acid biosynthesis was completely abolished when mutants 11-1, 11-2, 2-1 and 2-2,
defective in both copies of the orf3 gene, were grown in Soy medium and compared to wild
type S. clavuligerus.

These results demonstrate that, under the conditions tested, both genes, orf3 and
orf3par, contribute to clavulanic acid biosynthesis, as the double disruption results in a mutant
unable to make any clavulanic acid.

6.5 Southern Analysis

The orf3, orf3par and orf3/orf3par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.

Example 7 - Analysis of orf2 and orf2par

7.1 Construction of orf2 mutants

Mutants disrupted in orf2 were originally made as described in United States Patent
No. 6,332,106. These original orf2 mutants were subjected to a second round of gene
replacement to remove the apramycin resistance gene and replace it with a simple frameshift
mutation. The plasmid construct used to create the original orf2 mutant consisted of a 2.1 Kb
EcoRI/BglII fragment of S. clavuligerus DNA carried on a pUC119/pIJ486 shuttle vector,
with the orf2 gene disrupted by insertion of an apramycin resistance gene cassette into a
centrally located NcoI site (United States Patent No. 6,332,106). The disruption plasmid
construct used in the second round of mutation was derived from this original disruption
plasmid by digestion with NotI to release the apramycin resistance gene cassette, treatment
with the Klenow fragment of DNA polymerase I to fill in the overhanging ends, and then
re-ligation to circularize the plasmid. The resulting plasmid construct carries the entire orf2 gene
but with a frameshift introduced at the location of the destroyed NcoI site. The construct was
used to sequentially transform S. lividans TK24 and then the original S. clavuligerus orf2
mutant. Primary transformants were selected on thiostrepton (5 µg/ml) and then subjected to
two rounds of sporulation under non-selective conditions. Putative double cross-over mutants
were identified based on their loss of apramycin resistance.
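
Why a fill-in followed by re-ligation produces a frameshift can be checked with a small string model. The Python sketch below is an illustration under stated assumptions, not the procedure of the patent: the toy sequence and the function name are hypothetical, and the NcoI recognition sequence (C^CATGG, leaving a four-base 5' overhang) is general knowledge rather than something stated in the text. The text names both NotI and NcoI; NotI (GC^GGCCGC) also leaves a four-base 5' overhang, so the same arithmetic applies to either enzyme.

```python
NCOI_SITE = "CCATGG"  # NcoI cuts C^CATGG on each strand (4-nt 5' overhang)


def klenow_fill_in_religate(seq: str, site: str = NCOI_SITE, cut_offset: int = 1) -> str:
    """Model cutting at a unique palindromic site that leaves a 5' overhang,
    filling in both ends and re-joining them blunt.

    cut_offset is the number of bases of the site lying 5' of the cut.
    Filling in duplicates the overhang bases at the junction.
    """
    if seq.count(site) != 1:
        raise ValueError("the model expects exactly one copy of the site")
    pos = seq.find(site)
    top_cut = pos + cut_offset                 # cut position on the top strand
    bottom_cut = pos + len(site) - cut_offset  # matching cut on the bottom strand
    # The left end is extended through the overhang; the right end keeps it.
    return seq[:bottom_cut] + seq[top_cut:]


toy_orf = "ATGGCTGCCATGGAAACGTAA"  # hypothetical sequence with one NcoI site
mutant = klenow_fill_in_religate(toy_orf)

print(mutant)                     # ATGGCTGCCATGCATGGAAACGTAA
print(len(mutant) - len(toy_orf)) # 4
```

The junction gains four bases, which is not a multiple of three, so everything downstream is read in a different frame. The original recognition sequence is also lost, consistent with the description of a destroyed site.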

7.2 Construction of orf2par mutants

orf2par mutants were generated using a PCR-based targeting kit known as
REDIRECT (trade mark of Plant Bioscience Limited, Norwich, U.K.). The plasmids pIJ790
and pIJ773, and the host strain E. coli BW25113, were supplied as part of the kit. For this
particular application, a pair of oligonucleotide primers,

KTA14: 5'-CCATCCCGGCGCCCGTCCGATGCGAAGGAGATCTCCATGATTCCGGGGATCCGTCGACC-3' and

KTA15: 5'-CGGGGCCGGGCATGGTGAACTCGTCCTCCACGGTGGTCATGTAGGCTGGAGCTGCTT-3',

designed to disrupt the orf2par gene by insertion of an apramycin resistance gene, were
synthesized. The orf2par disruption cassette was generated by PCR
using these two primers with the plasmid pIJ773 as template. PCR conditions used were as
described in the user instructions except that no dimethylsulfoxide was used. The orf2par
disruption cassette was then introduced by electrotransformation into E. coli
BW25113/pIJ790 which had been previously transformed with the orf2par-bearing cosmid
14E10 (described hereinabove). Cosmid DNA was isolated from transformants after
overnight growth at 37°C to promote loss of the pIJ790 plasmid and analyzed to confirm that
the orf2par gene had been disrupted. orf2par-disrupted cosmid DNA was then transferred into
wild type S. clavuligerus by conjugation. Conjugation was carried out as described by Kieser,
T. et al. (2000) supra except that AS-1 medium (Baltz, R. H. Genetic recombination by
protoplast fusion in Streptomyces. Dev. Ind. Microbiol. 21 (1980) 43-54) supplemented with
apramycin at 50 µg/ml was used for recovery of transconjugants. Apramycin resistant S.
clavuligerus transconjugants were subjected to one round of sporulation under non-selective
conditions in order to generate gene replacement mutants as described by Paradkar and Jensen
(1995) supra.
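
The structure of the two targeting primers can be inspected with a few lines of Python. This is a sketch under an assumption that the text does not state: that each primer consists of a 39-nucleotide extension homologous to the target followed by a shorter sequence that primes on the cassette template, as is usual for this kind of PCR targeting. The constant ARM_LENGTH and the function name are hypothetical; the primer sequences are those printed above.

```python
# Primer sequences as printed in section 7.2 (5' -> 3'), joined across the line break.
KTA14 = "CCATCCCGGCGCCCGTCCGATGCGAAGGAGATCTCCATGATTCCGG" "GGATCCGTCGACC"
KTA15 = "CGGGGCCGGGCATGGTGAACTCGTCCTCCACGGTGGTCATGTAGGC" "TGGAGCTGCTT"

ARM_LENGTH = 39  # assumed length of the homology extension; not stated in the text


def split_primer(primer: str, arm_length: int = ARM_LENGTH) -> tuple[str, str]:
    """Split a targeting primer into (homology arm, cassette-priming tail)."""
    if len(primer) <= arm_length:
        raise ValueError("primer is shorter than the assumed homology arm")
    return primer[:arm_length], primer[arm_length:]


fwd_arm, fwd_tail = split_primer(KTA14)
rev_arm, rev_tail = split_primer(KTA15)

print(len(KTA14), fwd_arm[-3:], fwd_tail)
print(len(KTA15), rev_arm[-3:], rev_tail)
```

Under this assumption the forward arm ends in ATG, a start codon, and the reverse arm ends in TCA, the reverse complement of the stop codon TGA. That is what would be expected if the cassette replaces the coding region between the start and stop codons of orf2par, but the split point remains an assumption.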

7.3 Construction of orf2/orf2par mutants

The PCR-based targeting procedure used to generate the orf2par mutants (section 7.2)
was also used to generate orf2/orf2par double mutants. In this case the orf2par-disrupted
cosmid DNA was conjugated into the orf2 mutants described above (section 7.1) rather than
into the wild type strain. Apramycin resistant S. clavuligerus transconjugants were subjected
to one round of sporulation under non-selective conditions in order to obtain unigenomic
mutant spores that had undergone gene replacement as previously described.

7.4 Fermentation analysis of orf2, orf2par and orf2/orf2par mutants

To test the effect of disrupting orf2, orf2par and orf2/orf2par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of orf2 mutants

Mutants defective in the orf2 gene were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth, accumulation
of clavulanic acid was reduced by 95-98% (Jensen et al. (2000) supra).

From these results it can be concluded that orf2 is required for efficient production of
clavulanic acid, as elimination of this gene by disruption causes a severe reduction in
clavulanic acid production.

Fermentation analysis of orf2par disruptant

Mutants defective in the orf2par gene were fermented in Soy medium and compared
to wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth,
accumulation of clavulanic acid was reduced by 10-30%.

From these results it can be concluded that, like orf2, orf2par contributes to
clavulanic acid biosynthesis, as elimination of this gene by disruption causes a reduction in
clavulanic acid levels.

Fermentation analysis of orf2/orf2par disruptants

Mutants defective in both orf2 and orf2par were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72 hrs growth, no clavulanic
acid production could be detected from the strains containing the orf2 and orf2par mutations.

These results demonstrate that, under the conditions tested, both genes, orf2 and orf2par,
contribute to clavulanic acid biosynthesis, as the double disruption results in a mutant unable
to make clavulanic acid.

7.5 Southern Analysis

The orf2, orf2par and orf2/orf2par mutants were further characterised by Southern analysis.
The results confirmed that in these mutants the chromosomal copies of the relevant genes had
been disrupted as expected.

Example 8 - Analysis of cvm7 and cvm7par

Sequence analysis had identified two additional genes in the paralogue cluster that
did not have obvious paralogues in either the clavulanic acid or cvm gene clusters. It was of
interest to determine if either of these genes was a paralogue to an as yet unidentified cvm
gene. Therefore the sequence of the cvm cluster (WO98/33896) was extended downstream of
cvm3 (orfup3 in WO98/33896).

8.1 Extension of cvm cluster sequence 

The cosmid 10D7 (described in WO98/33896) was digested with the restriction
endonuclease SacI. From this digestion a 6.8 kilobase DNA fragment containing cas1 and
cvm1 was isolated and cloned into a pUC119-based plasmid. The resultant plasmid pCEC019
was used as a template to generate sequence information which allowed completion of the
partial cvm3 gene reported in WO98/33896. In addition, the sequence information showed the
presence of another open reading frame, cvm7, which was incomplete in this fragment. In
order to complete the cvm7 gene sequence, the next adjacent SacI fragment from cosmid
10D7, a 1.9 kb fragment, was subcloned. Sequence information was obtained from the end of
the clone which contained the remainder of the cvm7 gene, up to the point where the start
codon for the cvm7 gene could be identified. In total, this resulted in the generation of a
further approximately 3.9 kb of new DNA sequence, which is described in Sequence ID
No. 17.

8.2 Sequence analysis 

The size of cvm7 and its orientation relative to the rest of the cvm cluster are shown
diagrammatically in Figure 2. Sequence homology searches demonstrated that this gene shares
homology with transcriptional regulator genes. In addition, cvm7 also shared 33% identity
with one of the two genes identified in the paralogue cluster that did not have any obvious
paralogues within the known clavulanic acid or clavam biosynthetic genes. Therefore, since
cvm6 and cvm6par have been shown to be paralogues, from this sequence data it can be
concluded that cvm7 and cvm7par are paralogues of genes involved in 5S clavam
biosynthesis.
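
A figure such as "33% identity" is obtained by counting matching positions in a pairwise alignment. The Python sketch below shows one common way to compute it, offered as an illustration only: the text does not say which alignment program or which definition of identity was used, the function name is hypothetical, and the two aligned strings are invented toy data rather than the cvm7 or cvm7par sequences. Here identity is taken over columns in which neither sequence has a gap, which is an assumption; other conventions divide by the full alignment length or by the shorter sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment (not taken from the sequence listing).
toy_a = "MKYD-ITPPS"
toy_b = "MRYDLVTP-S"
print(percent_identity(toy_a, toy_b))  # 75.0
```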

Brief description of the figures 

Figure 1. Diagram of the paralogue cluster. The orientation of transcription is shown for each
gene (direction of arrow).

Figure 2. Orientation of cvm7 in relation to the published cvm cluster (WO98/33896).

Figure 3. Annotated sequence of the paralogue cluster.



Brief description of the sequences 
SEQ ID NO:1 cvm6para open reading frame
SEQ ID NO:2 cvm7para open reading frame
SEQ ID NO:3 cvm6para polypeptide
SEQ ID NO:4 cvm7para polypeptide
SEQ ID NO:5 cvm6 open reading frame
SEQ ID NO:6 cvm7 open reading frame
SEQ ID NO:7 cvm1 open reading frame
SEQ ID NO:8 cvm2 open reading frame
SEQ ID NO:9 cvm3 open reading frame
SEQ ID NO:10 cvm4 open reading frame
SEQ ID NO:11 cvm5 open reading frame
SEQ ID NO:12 orf2para open reading frame
SEQ ID NO:13 orf3para open reading frame
SEQ ID NO:14 orf4para open reading frame
SEQ ID NO:15 orf6para open reading frame
SEQ ID NO:16 paralogue cluster
SEQ ID NO:17 extended cvm cluster (underlined sequence denotes new sequence over that
disclosed in WO98/33896)
SEQ ID NO:18 orf2para open reading frame (reverse complement)
SEQ ID NO:19 orf3para open reading frame (reverse complement)
SEQ ID NO:20 orf4para open reading frame (reverse complement)
SEQ ID NO:21 cvm6 polypeptide
SEQ ID NO:22 cvm3 polypeptide
SEQ ID NO:23 orf6para polypeptide
SEQ ID NO:24 orf4para polypeptide
SEQ ID NO:25 orf3para polypeptide
SEQ ID NO:26 orf2para polypeptide
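
Several entries in this list are given both as they lie in the cluster and as the reverse complement. The relationship between the two forms can be sketched in Python as follows. This is illustrative only: the function name is hypothetical, and the twelve-base example is the start of the orf3para entry as it survives in the extracted listing, which is damaged elsewhere and should not be relied on as sequence data.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# First twelve bases of the orf3para open reading frame as listed (cluster orientation).
listed_start = "TCATACGACCAC"
coding_end = reverse_complement(listed_start)
print(coding_end)  # GTGGTCGTATGA
```

Read on the opposite strand, the first bases of the listed entry become the last bases of the coding strand, and they end in TGA, a stop codon. This is what one would expect if the gene is transcribed in the opposite direction to the orientation in which the cluster sequence is written.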
Sequences

A^TCC^ScXSGTcSsCCCCXSGGGCCX^ 

CGAAGGGCGCACCTATCTCGACGCCTOSTCGGTGCTCCKSACTGACCCAGATCG^^ 

CCG^CCQftGCAQATGCGGACACTCGGTCACTTCCAC^^ 

GCGCGCXrrCACCaACCTQGCGCCCCAGGGTCTC 

CCTGCXSOVTGGCCCGTTACTTCCACCACCGCACCGGCAGCCCGG^ 

^Ca^CCrCAOTCCGCCCGACCOSTACCACGCCGAGCTGTACCaCGGCGftG^ 
CGCCC«avCCATCGACGAGATCGGCCCCGG6CGGATCGCCGCGATGAT^^ 
TCGTCCCGCCGCCGGACTACTGGCCGCGCGTCGCCGCGCTGCTGOK^^ 
GTCACCXSCXSTTCGGCasa^CGGGGACCTCGTTCGCGGCCGAGCyVCTT^^ 

GGGCATCACCTCCGGGTATGTCCaSCACGGGGCGGTGCTCCTGACCGAGGAGGTCGCGQACGCC^^ 
GGTTCCCGATCGGCTTCACCTATACCGGTCACCCCACGGCGTGOSCCXSTCGCGCTCGCCAATCTCGAC^^ 

S^SSJ^SIS^tgLggtgggcgaccacctcgccgggcggctggcggccc^^ 

GGGGGACGTCCGGCAACTGGGCATGATGCTCGCCGTCQAGCTGGTC^ 

^S^SctcotSacgSctgcgcgaggacgcggoc^ 
cgggcggatcx3gcgcggccccccggcgggggtga 

iTCT^cScK^G^raAGCTTCGTCACGACGTCCCCGGCCTGCCGGGTCCGTCACCXSTCCAT^^ 

SotS^S^acSgcSacScggraactggagctgggccctc^^^ 

ISS5SS5S^?cS^GATCGTCTTCCGTATCTGGGGCAACTCACCACCGGGCGC^ 

SSagtcctatgtgtcccggctgcggaaactcctggccgagtgtgtgctc^^^ 

CcScCGcS^ACACCCTCGaKrrCGGCACCGAGCA(^^ 

gSSSSS^^S^SSSgSSS^cgcgggccgtgctctgc^^^ 
JS^Sa^gaoS^c^cotcgccgtccaggaggccaatcggctggagcagctcc^ 

SS^S?^ScTOTCTGcScrGGGGCGGGACGAGGA<M 

?SS?SSSctSt^g^gctcatgcaggcgc^ 

^SSS^SSS^TCGCC^G^CTGOGC^CCGATCOGGGCAAGCa^ 
?S^SJc^S^GSTS?ScGGCGTCCGa3CCGCCGTCGGCGGG^ 

?S?SSgSSSStcSggccgtt6acgcggccgqtggcggggcgogcgcgggtccc^^ 

G?^C^^S?Scc?SGTCCGCCTCCGGCTCCGTTTCCGam:CGTT^^ 

?S???S^?SSGScSSSSScCCGGCTCCGTTTCTGGCTCX3Ga5TCCO^ 

SccSSSSJcSccJ^OSCTTTCGGGTCCGTGGCGCTCCACCG^ 

SScgggggSScaggggatgcgcacosg^ 

SSSS?5^SS??^A?aTCa3CGTTCCACACCTCGGGGCGGGTGGCGT^^^^ 

qc^ccSgctcctctccgagttggagcgctcggttccggacagtgtgcgcaccgtct^ 

^SSgSSactStggccgtggacgaccgtgctgcggcatctgtacgcgatgt^^ 

^?Sqt?ggctqcSSScactcgcggaactgc^^ 

^??SScGCCCcSOTCTCCAGAaAGGCTCSTTTCACC^ 

?SSccotStStStoctggaggacatggagcgggccgac^^ 
ScSSS?cSSStgct?gtggtcaccacgcgcaccttc^ 
5^S?SSa5cctcSgtcgaccggcgcgcgccx3GGtc^^ 
cSaSS?5SS^S?ggccccggacaccctcctcgtacgggcc^ 

TCGTCCAGCTCCTCCGCTOKrrCCGGCAGGGGCTCGCaSCaKJCTGGGAGAaKSAaATCCC^ 

gtoctSSgSStcgagcgtgccgcccgccgtgcgccgggtgctcqacatctgcgcggtot 

a^^tgtStSagaccgtgctgcgccatgagggaatccosctggagaacgtccgt^ 

J^ggIScSgacgaccccgggcggctgaggttcgtgcatccgctggtccggqaggccgtctgggac^ 

AACACCCGTCGGCCCGTSTCVMARGTCCTOTTCCTCCGCGCTCGGGGaWTGGCCAO^ 

S^P?SlLYDSDVTEYCLREIJmTIDEIGPGRlAAMIGEF7MGAGGAVVPPPD^^^ 
W^RTGTWB^HFGVTPDLLWAKGITSGYVPHGAVLLTEEVADAWGETGPPIGPT^ 
^iSvS;G^ORIJ^GLPAVGDVRQLGMMIAVELVSDKTARTPI.PGGTIX^^ 
PJUjVMDIU^TJUDEVATCIiDSVLRRIAPDGRIGAAPRRG 

?&^cJ?^1^7lpsPS^°^^^ 

IS5?sI^ScVLPDGSTPEiaJIQPPGYTIJ^TBHIDANRFB^ 

i ^eISaS^S^SeSixsavetoahcclrixsrdeevmi^ijcpevqrnpi^ 

I^S^SiJ?SLAM«AAIl.RQDNGU)RWPASAPPSAGVGRGAVTVSW^ 
AGAGAAPASASGSVSASVSGSGSGSGSAPASVPTFPPGSVSGSASVAASViU^SGHVSGPGSAFGSVAIiHRPQ^
VHGGAQQ1RTGQVFPTI*PPFVGRGDELRGIjIjESATSAPHTSGRVAFWGEAGSGK^ 
EDRPDYWPWTTVIjRHLYAIWPERmGPPGWIiRRAIiAELLPBVGPEP<^ 
ALAPPRSREARFTIiHDAVTOAUJRTNmEPWIMIiEm^^ 
AAVIIiQSTGAIU^VIJ^ALDARATGEIAG<24^

VLQIOiSSVPPAVI^VIjDICyVVVERSCBRRVIEl^^ 
NTRRPVSRSSAIiGAIiATV 

SEQ ID NO:5 cvm6

GTGCCCGGCTCCGGACTCGAAGCACTGGACCGTGCCa^CCXrr^

CGTGCTGACCTCGGGGTCCGGCAGCCGGOTCCGaSACACC^ 

TGACCCAGGTGGGCCACGGCCXSGGCaSAGCTGGCCCGGGTCGa^ 

TGGGGGACGATCAGCAACGACCGGGOKSTGGAGCTGGCGGCAC^^ 

CTACTTCy^CCAGCGGCGGGGCCGAGGGCAACGAGATa3CCCTGCX5GAT« 
CCX3CCCGTACCTGGATACTCTCCCX3CCGGTCGGCCTACCy^CGGCGTCX3GATAC^^

GCCTACCACCAGGGCITa3GCCCCTCCCTCCCGGACGT(^(^ 

CXSCCGGTTCCGAOSTCTICCGACTTCTGCCTCGCCGAACTGCGC^ 

CGATGATaSGCGAGCCGATO^TGGGOSCXSGTCXSGCGCCGCG^ 

CTGCACTCCTACGGCTVTCCTGCTGATCTCCGAaSAGGTGATCAa^^ 
COICTTCGGCXSTGGTCCCGGACATCATGGTCACCGCCMGGGCATTC^^

ACC^lCaSAGGCCXBTaKICGACXaAGGTCGTCGGCGACCAGTC 

CTGCGCGGTGGCCCTGGCCaU^CCTGGACATaVTCmGaKX^ 

TCGGCAAACGCCTGGCaSAGCTGAGCGATCTGCC^ 

CTGGTCGCaSACCGOSGAACCCGGGAGCCGCrrGCCGGGC^^ 
GCTGCGCGCCAACGGCAACGCCCTCATCGTCAACCCCCCGCTGATCT^^

GCCTGCGCTCCGTACTCGCCCXKACCaWKSCCGGACGGCCGGGTGCT 

SEQ ID NO:6 cvm7

ATGAAGTACGACATAACCCCACCATCOSGCCTTCGGTTCGACCTCCTCGGCCCGT^ 
CGTGGACCTCGGCGOTCCy^CGGCAGOTaSCCCrGCTCGCCCTGCT^

TCATGACaSCGTOSATCrGGGGGGCCGACCCACCGTCCCGGGTCCGGGGGACGCT^ 
AAACTCCrGCACCGCC:aVTGACCGTTCCCTTCXK:CTTGTCC:ACCAG 
GGTGGACGCCGTGGTTTTCGAGACaiCGTGTCa^GGGAGTQCCGGGAATTGAGC^ 
CCGTGGCCTGGTCCGCCCTGGAGATGTGGAAGGGCACACCCATGGGCXSAGCTGCATGAT^ 

GCCGACCGGCTGGAAGGAATCCGGTTACGCGCGCrGGAQACCTGGTCCCAGGCGTGTCTO^
GGTTGCATTTO^GCTCGGCGAGGAGATCCACCXSCAATCCXSGAACTGGAAaKSCT^ 
ATCATTCCGGAOSGTCGGCGGAAGCCCTGTTGACGTATGAACGTATGCGTACCGCGGTGGCGGAGi^ 
ATCAGTCCGGAGCTCCAGGAACTCCATGGAAAGATTCTGCGCCAGGAACTCACGGAGACACCCGCCraCG^ 
CTCCCrrCA(^CGGGCGGCGGGCCCGCACGGGCCCCCGCCCCTGGCCGAAACCGGCACCCCCGC 

CCGAAACCACGGTGGCGGAGGAAAGCGCCGCGCCCCCCGCCCCGGCGGCGCCCGGGACCCCGCCCCCC^^

GTACCGCrCCCCCATCCGTCAGGGGCCGTCCCGCCGGTCACCCCGGTGCCTCCCCCGGTCCCCCGCTCGGCCCT 

AGCGGCACCCGCOSAGACCGAGGACCCGGAACCGGCGCCGCCCCCTCCCCCTCCGCCGTO 

GCGCaSTACTGCGCS^CTGaMCTGCTGCTQAC^ 

CAGGGCATCGGQAAGACCCGGCTCCTGGAGCACACCGAGCAa^CCCTQGCCGCGGGCGC^^ 
CTGOSTCGCCACCCTCCCGGCACaSGGCTACnXSGCCCTGG^

GTGACGACGGCXSACGCCGACCCCGTCGCCCS^GGCCGAGTGGCroCCG^ 
CGGACGGTGCTCGCCGCGGCGCGGCGGACCCOSCTCCTGTTGATCCTGGAG^ 
GGATGTGCTCCAGCTCCTGGTCAAACAGATCGGCCAGGCCCCCGTCATGGTCGTCGCCACCe^^ 
CCCGGGACCCCGCCGTCCGCCGGGCCGTGGGCCGCyVTCCTCCAGGC^GGa^a^CCGGCACCCr^ 
ACCGAGGAGCAGAGCCGGGAGCTGATCGTCTCGGTCGCGGGGGCCCCGTTCGCGCCCO^^

CGCCTCGGGCGGCAACCCGTTTCTGCTGCTCAGCATGGTCACAGGGGAGGACGGCACCCAGG^^ 
TCCCGTTCGAGGTGOSCGAGGTGCroCACGAGCGGCTGAGCGAATGCTCCCCGTCCACCCAGGACGTGCT 
GCCGTGCTCGGCATGAGCGTGCGCCGACaSCTGCTCACCGACATCATGTCCACGCTCGAO^^ 
CGACGCGCTCGGCAa3GGGCTGCTGCGCCACGACCGGAAC».CCGACGGAATGGTCC^ 
ACTTCCTGCTCGACGAOVCCCCGCCGGTOVCCCGCGCCaSCTGGCACCACCGGGTCG^
CAGCAGGGCQACGACa^CGCCQAGATCaK:CX3Ca^CTGTCTGGCCGa^ 
CCCCCTGCTGGCGdTGGCCGACCGGGAQCAGTCCCGCnTCrCC™^ 
CGGTCGTCGCGGCGCTGCCCa3GGACC»WK:CGGTGTCCGCOT^ 
GCGCTGATGGACGGCTATGGATCGQCCCGCGTaaAGACGTTCCTCTCCC^^ 
CACCCAGCCCACCGGGCTGCTGCACGTCCAGGCGCTGAGCGCGCTCACCACGGGCCXSCCAl^
CCGGGCTGCTGCACGAGCTGGCCGACCACGGCGGOKSACCGGAGGCCCGGTCGGC^ 
CTGTATGTGGGCGGACGGGTOSACGAAGCCCTCGCCGCXKrraKrCCAGGG^^ 
ACACCGCAGGACCXSCCGCCCCGCyvCGGCGGCGGGCACCTCCAGGACCGGCGTATCGACTTC^^ 
GCCACTGTCTCAGCGGCGACCGGATTCAGACCCAGCGCTACaSGACGGAACrCCTCCACCTC^^ 
GACCGGCCGTGGGACCGGGCCTTCGCCCGCTATGTGGACGCGCTCATCGCCGTCACGGAGTGC^^
GCTGGCCGCGCGGGCGGGGCTCGACCrCGCaSCCaSCTGCC^^ 
GCTGGGCCGAGGTCCa^CCAGGGGGCXKZACXJACAAGGGGC 
a3GACXXrKKnXKX3CXX3TACX5CTCXaiCCTCGGCC^

GCGCACGATGTCXrrCCXXZCGTAaKKSAGATa^^ 

GGCTCCrCCACAGCCftCGGCACCTCCGCCGCGK3CXX3AGCACC^ 



SEQ ID NO:7 cvm1

ATGTCCXXXrrCTCCGCCCGAGTCCCCGGCOKSTTCCXSTGTCCGCCGCGGTTCCG 
CCrTCCGGTCAGTGCCC:aVGGGGCTa3GCTGCCntK:a5AC 
CCACCATCCXSCGCCGCCGTCXSACXKZaSGGCTCACCCTGCT^ 
CnXKrraSGACXSGGCGGTOSCGGGCCGCCGGGACmGGT^ 
CXK:CTCCCyVGGGCTTGTGCXX3CGAGCCGTCCTAan^

GCATCGACCTGTACTACCAGCACTGGACX3GACCCG6OTGTGCCGATC 

CGCGAGGGCAAGGTCCGO^CrCGGTCTCTCCGTUSCCCTCCGCXKK:^ 

GACGGCGGTGCAGAGCGAGTGGAGCCTCTGGTCGCXSCX^GGATC^ 

TCGGGATCGTCGCTTACGCCCCTCTGGGACGGGGTTTTCTCACCGGCACCATCCXKy^CC^ 
GACTTCCGCCGGGGCCAGCCCCGGTTCAGCGCTCCGGCCCTCGCGCGCAACCGCT^
CGCGGACGGTCTGGGGCTGACCCTGGCACAGCTCGCGCTCX^ 
CGGGCACCGCGAACCOSGCCCATOTCGCGGACAATCTCXSCCGCCGCCTC^ 
GTGACGGCCGCGATCTCCCACCCGGTGTCCGGGGAGCGGTACACCCCGGCATTGCTCGCa^T^ 

SEQ ID NO:8 cvm2

ATGTCCGTGGCATCGGCCGGTATGACGGAaaAGCAGOKS^Ga 

CGGCGTCGGCAGCGACGGCACCCCCGCGATCGACTACTTCGCCGAGGACGCGGTCTTCOT 
CCCXSGGGCAAGTCCGAGATCGCCCGGCTCTTCGACGACCTCGGGGGCACCATCCGCrCGA 
GTCAACTGGATTCTGACCGGGACCGAACTCCTCGCCGCGGAGGGCACCACCCACGGTGAGCACCGG^ 
GGCGGGTGACCCCGAGTGGGCCGCCGGGCGCTGGTGCACGGTCTAOSAGGTGCGGGACTTCCT
TCTATCTGGACCCCGATTAOSCGGGCAAGGACACCGCGCGTTACCCGTGGCTGTGA 



SEQ ID NO:9 cvm3

GTGACCCGGCCTCCGGGCCTTTCCGCGCACACCCACGGGTCOSTGTCCGGGAGTCTGCT^^ 
TCCCACCGGGGTGGTCCTGGTCy^CaSGTCCGGCCGAGGCTCCGGGGCAGCaSCCGCCCGCCATTC
CCTCXKSTGTCGCTCGATCCGGTGCTGGTGGGTTTCCTCCOSGCCAGGTCGTCGACGACCT^ 
GGGOTTTTCTGCGTCAATGTGCTOSGCGCGGATCL^GGGCCC^ 

GGAGGTGCCGTACCGGACGACGGCCACCGGCTCCCCCGTCCTGCTCGACGCGCTCX5CGTGGTTCGACTGCX3AGGTGGCGG 
GGGAGACGGAGGCGGGCGACCACTGGTTCGTCACCGGGGCGGTGCGCGACCTCGGGGTGATCCGCGAGGGTTCC^ 
GTCTTCCTGCGGGGCGACTACXKSGa^CTGGGCCGGGGGCGGCGGCraX^
GGTCTGA 

SEQ ID NO:10 cvm4

GTGGAATGCCGCATATTCGAGATCX3ACX3AACTGCCX3TTGCTGGACGGGGAGGTCCTGCX3GGACGCCCGGATCTC 
CATGTACGGCACGCCGAACGCCGACGGGACGAACGTGGTGCTCTGTCCGTCGTTCTTCGGCCGGGACCACACCGGGTACG

AOTGGCTGATCGGTGCGGGGCTGCCGCTGGACACCCGGCGGTACTGCGTCGTCACCGCCGGACr 

TCCAGCTCGCCCGGCAACCACCCGTCGGGGTCCCGCTTTCCGCTGATCACTCCGCAGGACAATGTCGCGG 

GCTGCTCy^CCGAGGAGCTGGGGGTACGGGAACTGGCCCTGGTCACGGGCTGGTCGATGGGCGCGGCCC^ 

GGGCCGTGTCGCATCCGGGGATGGTGCGCCGGATCGCCCCGATCTGCGGGGCGCCGGTGAGCAGCCCGCACAGC 
CTGCTGTCCGGTCTGGCCGCGGCGCTGAGCGCCGAOSCCGGGGAGCGGGGGCGGAAGGCGGCXSGGCCGGGTGTTCGCCGG

GTGGGGGACCTCGCGTTCCTTCTGGGCCCGCCGTGCCCACCGGGAGCTGGGTTTCGCCACCCGCX3AGGAGTACOT 

GCTTCTGGGAGCAGGTCraCCTCTCCGGGCCCGGCGCaSCGGATCTGCrCaVCCA 

GTGGGGGOSACACCCGGGGCCGGGGGGAGCGTCGAGGCGGCGCTGGCCTCCGTC^ 

aSCCCTGGACGTGTGTTTCGCCGTCGAGGACGAGAAGOSGGTGGCCGATCT 
aSGGAGTOTGGGGGCATCTOSCGGGGTCCXSGGGGGTaKSCCGCCGACOT

CTGGACA6CCCGGTGGACGGGGGCTGA 

SEQ ID NO:11 cvm5

GTGAAGTCCATTCTCTTCTATCTGCCAACGGTCGGCAGTCATGCGCAGGTCCAGCGGGGT^^ 
GAACTACCAGAACATGCTCOTGCAGCTCACCCGGCAGGCGaVGGOSGCCGAaa^
CCGAGCACCACTTCCACACC6AGGGTTTCQAGGTCTCCSU^a\ACCOTAT^ 
CGGCAOVTCaSGGTCGGCCMATGGCCa^CGTCCTGCCGCTGCa^CAATCCGCr 
CGACCy^CATGACCCGGGGCCGOSCCTTCGTCGGQATCGCGCGCGGGTTC 

TGTACGGGGTCGGCGGCACCCTOTCCGACGCCGGGGAGCGGGACCGGCXSCAATCGTGCCCTCTTCGAGGAGaiCT^^ 
ATCATCAAGAAGGCGTGGACGACCGAGACGTTCACCCACTCCGGGGAGCAGTGGACGATCCCGGTGCCGGACCTGGAGT^
CCCCTACGAGGCGGTGCGCCGCTACGGCCXSGGGCCrCGACGAGAACGGCGTCATCCGCGAGGTGGGCATCGCGCCC^ 
CCTACCAGCGCCCCCACCCGCCCGTCTTCCy^GCCGTTCAGCTTCAGTGAGGACACXSTTCCGGTTCTGTGCCCGGGAG 
GTGGTGCCGATCCTGATGAAOVCCGACGACCAGATOSTCXXrCCGGCTGATGGACATCrACCGGGAGGAGG 
GGGCCACGGCACCCTGCGGCGGGGCGAGCGGGTCGGGGTGATGAAGGAOaTCCTGGTCT 
ACCACTGGGCGTCCCGCGGCGGaSGCTTCa^TCTTCGAGAACTGGTTCGGCCCCATGGGCT^
ACCGGCGAGACGGGTCCGATCGGCTOSGACTAaW^CCCTTC 
CATCAACCGCT^TGATCGACSAAGCrcGTGGAGCGGCACGATC
CGCACX3ATGTCaiGCTGCX3CAGCCTGGAGCTGTGGGCX:ACC^ 

SEQ ID NO: 12 orf2para

TGAGATGGCCAGGGCGGCXSAAACOKrCX^GACTGGAAGT

aKKXK:CCITGGTGAGGGCGGCGAGCAGCGAGGTGCTC 

TGGACGAAGTCGACXSCTTCCGAAGCCGACGGCGGGGGCGTGGGAGC^^ 

GCCGTTGOKSTCGTTGTTGACXSAaSACCATGACGATCGGCAGGCCC^ 

AGTGGAAGCCGCCX5TCX3CCCGCX3ATGAGGAAGACX3GGCTCX3C<:X3GGCC^^ 
CCGTAGCCGAAGCTGGAGCAGCCCGCGGAGGTGAGGAATCCGTACGGCTGGTCGGACT^

GCGGAAGAAGCCGATGTCGCTGACGAAGGTGCCGTTGTCGAGGAOSGA^^ 

TGCCXSTCCrCGTACrCGGTGGGGTCGGCGAGGAATTCGGCGAC^^ 

GGGGCGAGGCCCXSAGCTCGCGTOSTCGAGCGCXSGTGACGAATT^ 

CAGCTCCGGGATCX3GGTTGACCTCGGGGGCGACCCGGACCGTGGTCT 
GGTCCTCGGCGTAGTCGTAGCCGATCGCCAGGAGGAGGTCGGCGGGGCaST^GATCTTO

ATGCCGTCCATGTAGCCGCTGATGGCGCCGTAGTTGAGCGGGTGGTCGTGCGGC^ 

GACGACGGGGATGTTCAGCCGCTCGGCGAGGGCGCGCAGGGCGTCHSACGGCCCC^ 

CGAGGAGGGGGTTCTCGGCCrCGCG(^CCAGCTCAG03GCCTC:X3TC<^ 

GTGGCGGTGGCCaSGACCAGGGGGGCGTOSGTGGGGGTGCTO 
GAAGCTGGGACCCACGGGCTCGATCCGGCTGTTGAGGACGGCXSCTGTCGACGAGGT^^

GCTGGACXjCTGAACTTGGTCJVGCGGGCCCATCACGGCGGTGCTGTCCAGGCACTC 

TACGACTOKSACTGCGCGGCCy^GCGCXaATGACCXSAGCTGCGGTCC^ 

GATGCCGGGGCCCAGGGTCGOSAAGCACGCCTGGGGGCGGTTGGTGATCaSGGC 

TGAACTCGTGCCGGGTCAGGACGAAGTaSAGTCCITCGACerC^TCGJ^ 
CC!GAATACATGGTCGACa^CCGTACrGGTGAAGACGTTCCAGCAT^

SEQ ID NO: 13 orf3para

TCATACGACCACCCGGCCCTGGAGCCTGAGCCTGCXSCACCGCGTCGACGGAGCGCCG 
CCrCCGGCGGCACCGTGTCGATGACCyVCCGCGTCGTACyVGGCGCCGTGCCATGGCGCCCT^ 
CX3CaK3ATCCerTCGGCGAGGAGCAGTCCGGTCCACGCGCTGGTGGTGCCGGACCC^
GGCCACGGTCTCGGCGGGCAGCAGGCCGGAGAGGGCCTGCCGCyy\CACCCAC^ 

CGGGTTCGAGGGAGACCAGCGCGTCCAGGACCGCGCGGTCCCAGTACGGGTGGGTGGTCCACTTCCaS^^ 

AGGACGGGGGACATCTCGTTGAGGCCGTCGAAGCCCGCCATGTCGCCCGCGATCraSTCGTCGAGGGACCM 

a3TGCGCCGGTGa\TACCGCa3AGCGGGATGTaKXX3CCGTACCCGG 

GGTAGAGGGCGAOSAGCXKXa^CaiGGTACrCCAGGACaSTGGGG

TCCCTGACGAGTTCXSGCCGAGTGGAGCaSGATCrCXaCTGTGCGCGGT^ 
GAACTOSTCGGACACCTCXSGTGCCCy^TaSACACGGACaSTGTCCCGGGTGCCAGGGCC^ 
OKSAGTCGATGCCGCCGGACAGGACGACGGTGGGGGCCGCCTCCCOGCCGCGCy^GCC^^ 
CGTTCGCCGACCAGGTCCACCGCCTCCCGTTCXSCaSGGCAGCGCCCGGGAGAGCGGGGGTGT 

GGCGGTGATGTCGGAGCCGCCGACrCCGTGCAGCAGGAGGGCXKSTCCCGGCGGGGACCCGGC^
GCGCGGTGTGGGTGCCGGACAGGCCCAGCX3GCCGGCCaK3CTCGTGCGCCAGGGTCTTCGC 
CCCGTCACGTCGGCGCGaVGCCACAGCGGTACCGAACCGGCGTGGTCGGTGGCCGCGACGGTCGCG^ 
GGTGAGCAGTGCGGCGAACCGTCCGTTCAGGAGCOSGAAGGCCCCGGGGCCCCAGCGCCGCC^
CGGCGTOSCCGAGGGCGGCyVGAGGAGCCGCCGAGCGCTCCGGTmGCTaSGCGCGGT^ 

AGCCGGACCTGGCCGTCGGCGACCAGGACGGGCGGACGGCCCAGGGTa\CGGCCGTTCCGCTCCAGAGCG^
GCCGTCGTGCACGGGGACATGGGTCCCGCGGACGGCGAAGOSGGGTGCGCTGCCGGGTTCGGAGTGACC^ 
CGCCXSGGGCGGCCCTCGGTGCCGATGOKIACCCGGAATCCXSTACAC^ 

SEQ ID NO: 14 orf4para

CTACCCCCACCGCTGCCCGGCGAAGTCCACGGCGCTCTCGGCGTCCACCGCGTCCACaSCGTTCTCGGCGT^

aSTCCGCCGCCGCCCCCGGTGGCAGGGGAGAGTCCT^CCGGTGCCGACGCGGGCGACGTGGTGGCGCGGGCGTACT^ 

AGCAGTTCGGCCCCXSATCTCCGCCGCCyVGCAGGGAGGTGATCCCCGACGGGTCGTACGCCGGGGACA 

GAAGCCGACGGGCCTGAGOTGCCCGACCAOSTCGAGCAGGGTCAGCACCTCGasaSAC^ 

TGCCGGTGCCCGGGGCGTACGCCGGGTCGACGACGTCGATGTCGACGGAGACGTACAGCGGCAGGCCGCCGACGGT^ 
CGGATCraCTCGGCGATGCOSCGCGGTGAGCGCCGGGTGAAGTCGGCGGCGGTGACGATGCT^

GTAGTCCAGGGAGTCGGGCTCCGGATTGTGGCCGCGGATGCOSACCTGGACCAGGCGCrCC^ 

CX3ATGGCCCAGCGGAAGGGGGTGCa3TGGTGGTAGGTGCCX3CCX3TAGAC^ 

TGCAGGAOSGCGACCCGGCCGTGGCGGGCGTGCACGGCGaSCAGGGC^ 

O^GGAACGCGTCGTTGCGTTCCAGGAGCCGGGTCAGGGCXaACCGTCGCGGTGTCC^ 
TGAGGTCGATGTCXXICCCCGTCGACCACGTCGATCCGGTCQAAGACCCCrrGGGCTC

AGGCTGGACTCGTGCCGGATGGCGCGCGGCGCGAACCGCGOSCCGGGCCGGTAGCTGGTGCCTCCGTTCTAC^^ 

GACGACCACCACGTCATGGCCGATCGGGTCGGGCCGGTGGOSCAGCCGCa^TGAAGGTCXSCCGGTTGG 

AGACGGCGGTGGACAC 

SEQ ID NO: 15 orfSpara

ATGCGTGCXTTCTTCGCCCAGAGGGTTCCGCGTGCaiCCACGGTaiCGCCGGGATCAGGGG^ 
CATOSCCTCCGACGTTCCCXSCGGaSGTCGGaKXSGTGTTCACCCGTTCGC^ 
GGGACGCXKSTCGCCGACGGGATOSCCCGGGGaSTGGTGGTGCraTCCGGC^
TACGAGGACGCOKXKSAGGTGCGCCM'CI^TGGCC^^ 
GGGACCCGTOSGCGAGCGGTATCCGATGTCCCGTGTCaKSGCCCATCTGa^ 
ACTTCGACGGCGCGGCGGCGGCCGTGCTGGGCACCGCXSGGC^ 
ACGCTGATCGGTGTCGCC7UVGGGa:CGGGTACGGGCK:a3GCGGAG^
GGACXSCCO^GGTGAGCCCCGTCXSTCCrCGACXSACATCraCCG^ 
6aX:CGACGCCTCCACCGGCGACAa3Ga3GCa3TTCTCG^ 
CAGGTCCIX3GGa3CGCTGGCGCTX3GACCTGGTCAGG^ 

GCGGGTCACCGGGGCCCACGAGACCGAGCAGGCCGGGOSCXSTGGGCCGGGCGGTGG^ 
CGGTGCACGGCCCGGCACCCGACTGGGCXSCaSGTCGCCGCOSTGGCXS^

CCCGGGCGGATCACGATCCGGGTCGGOKKXXSGGAGGTCTTCCCCGCCCCCCGCG^ 

CGCGTATCCGCACGGOSGCGAGGTGACaSTCCATATaSACCTCGGTGTCCa^^ 

ACGGCTGCGACCTCCrGGCGGGGTACCCGCXK:CTa3GCGCCGGCa^^ 

SEQ ID NO: 16 para cluster

CCATGGGAGCAGCATCGCAGTGCGCCTCCCCGGCCGCCa^TGCaSCTAGC^ 
GGGaSGTCCCGGGTGCGGOSGCCGGATCTAGTCGGTGTGCTCCGAC^ 

TGGTGGTTCCCGCGCCGGGCGGGCTGTGCAGCCGCAGTTGGCCGCOSAGTM 

CCCGAGCCCCXX3CAGGGGGCGGCGCCACaKX3GCTCTCX3TCGCGGATGCC^ 
ATGGACGTCGACGACGGTGGCy^CCGGAGTGCOTGGCGGCGTTGGTCAGGGCCTCG^

CGACCGGTTCGGGGTGGCGTTCCCCGGTCTGGATGTCGAGCCGGACCGGGATGGOSGAGaSCC^^ 

GCCGGGCGGAGTCCGCCCrCGGCGAGTACCGCCGGGTGGATGCCCCGGGCGACCTCCCGGAGTTOT 

CAGCCCGTOXSTCACCTCGTCGAGCTGCCGGATCAGCTCGTCGGCGTCGAGCGGCACCG^ 

GCAGCGCCAGGGAGACCy^GGCXSCTGTTGGGGGCCGTCGTGa^GGTCGCGTTCX^ 
GOSACGATCCGGGCCCXSTGACGCGGTGAGGGCCGCCTGCGTCTCCGCGTTGGCGATGGCGGTGGCCAC^^

GCOSGCCAGCCGGTCCTCGGTGTCCGACGGCATCGGCn^TCGTTCATCGACGC^^^ 

CGTOSACGTTGATCGGCATGCACACCGTGGCGCGGAATCCCCACrCCTTGCCGACGACGGAG^ 

GCCGa5TAGTCGTCGATCCGa3CCGGGCAGCCCGACTCGAA(^CCAGGGTGTGCACAOT 

GATACOKSCGGGAAAATCACGGCaSGTCCTGGTCCAGGCGGCXSACy^TACA 
GGACCGCXSAAGTCGGCasa^GAGGAGCTGTCCGGCCTCGGaSGCGACC^^

GCGACCAGGGTCGCCACGCGCCXSCy^GCGCaSCCrGCrCCTCXK^^ 

GATGGCGGTGGCCACGAGGTCGGTGAAACCGGCCAGCCGGTCCTCGGTGTCGGGaSGCAGCG^^ 
TCGCCATCATCACGCCCOVCy^GCCGTCCCrCGACGTTGATCGGa^CGCCG^^ 

AAGTCGGCGGGTGCCCCGGACGACTCGGCGGCGTCGTCGATCCGGGCCGGCCGCCCCGTCTCGGACAC 
GTTCCGGCCGTCGGGGTCCACCCGGGTGCCGATGGGGAAGAGOSGGCaSTGCAGACTTCrK^

TCGCCATGCCGTCCGGATCGAGCCTGATGATTCCGGTCACATCGTTGCCGAGCAGTTCTCCGACTTCGGC^ 
GCGAACATCTGTTCCGGTGGGGTGGCCCTGGCCACCAGGGTOSCCACCCGTCGGAGTGCC^ 
TTCGCACGACACGACCGCTGCCAGGCCCCCCTACCCGCCCGATGACGCCCGCATACOMGTATC^^ 
ACGTCCGCCGTGAACGCCCGTCAACGTGGCCCGCCGGAGTCGGGAACACGCGTCaSGAATCT^GCCCCCGG^^ 
CCGTCTTCCTCCGTCCGGCGCX3GGGCACTGCGCCGCGGCGGAATCCGCCCrGACCTCGGGAGTTTG^^
CAGOKSTTCGGGTTGGTGGGAAGGGATGTTGGCCGCTGGCGGCGATGCGGAAGCCGATCGTTC 
TGOSTCGCGGAGAGTCGGTCCGCTTCCCaSAGTGGGCCXXIGACGACGCT^ 

CCGGCGAAGGAGCTGCCGTGTCGQACGTCTTCXSCa^TCCQAGAAGAGTTCGCCCGGTGTCCGGACCCGCGCG^ 

CaVCCGCGCTCTGTO^TCAGCGCCGTCGGCGCCGTCAGCCa^CGCAG^ 
TGAGGTTCGTCACXSACGTCCCCGGCCTGCCXKSGTCCGTCACCGTCCATCACOT^

ACGGCCGGAAACTGGAGCrGGGCCCTCCGCGTa^GCGGGCCX5TTTTaK:CCTO 

CCGGTCGACTCGATCGTCTTCCGTATCTGGGGCAACTOVCCACCGGGCGCXSGTa^C^^ 

CCGGCTGCGGAAACTCCTGGCCGAGTGTGTGCTCCaSGAOKSTTCGACACCCGAACT^ 

CCCTCGCGCTCGGCACCGAGCACATCGACGCGAACCGTTTTGAGCAGGCCATCAGGACAGGGCraCC^ 
GAGCAGCACCAGGAGGCGCGGGCCGTGCTCTGCCAGGCCCTGCTGAGCTGGGGCGGGACACaST^^

GTACGACTTCGCCGTCCAGGAGGCCAATCGGCTGGAGCAGCTCCGGCTGGGCGCCGTGGAGACAT^ 

TGCGGCTGGGGCGGGACGAGGAGGTGATGGACCAGCTCT^GCCGGAGGTGCAGCGC^ 

GGGCAGCTCATGCAGGCGCAGTACCGGCTGGGGTGCCAGGCGGACGCGCTCAGGACGTATO^ 

GGCCGAGGAGCTGGGGACCGATCCGGGCAAGGAGCTGGCGGCGCTGCyvCGCGGCGATCCTGCGTC^^ 
ACCGCGTCGTCCCGGCGTCCGCGCCGCOSTCGQCGGGGGTCGGGCGGGGGGCCGTGACGGTGTCG^^

TOSAGGCCXSTTGACGCGGCCGGTGGCGGGGCGGGCGCGGGTCCCGGGGGCG^ 

CCCCGCGTCCGCCTCCGGCTCCGTTTCaSCGTCOSTTTCCGGCTCCGGCTCTO 

CCACCTTCTTTCCCGGCTCCGTTTCTGGCTCGGCGTCOTTTGCTC 

GGGCCCGGGTCCGCTTTCGGGTCCGTGGCGCTCCACCGGCCGCAGACCCTCCGGGGCGAGCCG^^ 
GGGGATGCGCACCGGGCAGGTGTTCCCCACGCTGCCGCCGTTCGTCGGGCGCGGCGACGAG^^^

CCGCGACGTCCGCGTTCCACACCTCGGGGCGGGTGGCGTTCGTCGTCGGCGAGGCGGGCAGCGGOVAGACCCG^ 
TCCGAGTTGGAGCGCTCGGTTCCGGACAGTGTGCGCACCGTCTGGGCGTCCTGTTCGGAGAGTGAGGACCGGCC^^ 
CTGGCCGTGGACGACCGTGCTGCGGCATCTGTACGCGATGTGGCCGGAACGTATGCACGGATTCCCCGGT^ 
GCGCACTOK^GGAACTGCTTCCCGAGGTGGGCCCGGAGCCyVCAGGGGCCGCACTCCCCCGACGGGGGC^^ 
GQCAACGGGGACGGTGCGGGCGACGGGGACAGCACCCCGGCGCACACCCTCACGCTOT
CTCCyVGAGAGGCTCGTTTO^CCCTGCACGAMCCGTGTGCCAGGCGCT^^ 
TGCTGGAGGACATGGAGCGGGCOSACGCCCCCTCGCTCGCCCTGC^^^ 
CTGCTGCTCGTGGTCACa^CGCGCACCTTCCGGCTCGCX3CACGAaK:cm^ 
GTCGACa3GOGaKX3CCGGGTCCTGCKSAACGCX:<nX3GAC^

AGGCCOXKSACACCCTCCTCGTACGGGCCCTGCACXSAGCGCKXXKXX^ 

TCGCTCa3GCAGGGGCTCGCCGCC»anKKK3AGACGGAGATCCCGGA 

GAGCGTGCCGCCCGCCGTGCGCa3GGTGCTCX3AaVTCTGa3CGGTCGTGGAG(X3CAGTT^^ 

CCSTGCTGCGCCATGAGGGAATCCCGCTGGRGAACGTCCGTACXX3CGGTCCGCX3GCGGTC^^ 

GACCa2GGG<:XMCTGAGGTTCGTGCaTa:GCTGGTCCGGGAGGCa3T^^ 

CT<XCGTTCCTCCGa3CTCXKK3G<:XKn^CCACGGTCTX3AGTCCCGGGCCCCG<^ 

GCGC^CCOSACGCCGGGCTTGATCCCCCGGGGCAGCCXSGACGaSCAGCXXKSGTGC^ 

GGCX3aCGGCCGTGGCXX3GTCGCa3CCCCCa^a3GrcCACa3AGGAGCCCCCATTGG^ 

CGCGGTa:GGCACCaVCCCCaAGCCXKX3TC(XX3ACX3CACCrCCCCACGa^ 

GAGCaKSXX:a3GACCCGQaCGCCGA6GCCXKX3TGGCTGCTCMCGGC^ 

CCGGGGCraSCGAGGACCGCaCCGTTCTGGTCTCCGGCaSCGGCTGCACC^ 

AOSCCTCGTCGGTGCTCGGACTGACCCAGATCXSGCCATGaACGTaAGGAQA 

ACACTCGGTCACTTCCACACCTGGGGCACC3m».GCAACGACAAGGaa^TCC^ 

GCCCCyvGGGTCTCCAGCGCXSTCTACnTCACCyVGCGGaSGaSGCGAGGGCXSTCGAGATCGC^^ 

TCCACCACCGCACCGGCAGCCCGGAGCGCACCTGGATCTTGTCGCQCCGCACax:CTACCACGGCATCGQCT 

GGTACGGTGTCGGGCTCGCCCGCCTACCAGGACGGGTTOSGCCaSGTGCTGCCCrATGTGCACC^^ 

CCCGTACCAa3CCGAGCrGTA<^CGGCX3AGGACGTCACX3GAGTACTGCCTGCGCGAACTaK:^ 

TCXMCCCCGGGCGGATCGCCGCGATGATCX3GGGAGCCGGTCATGGGCX3CGGGCGGCGCCGTCGTCCCXK:aK:C^ 

TGGCCGCGCGTCGCCGCGCTGCTGCGCTCCa^aSGCATCCTGCTGATCCrGGACGAGGTTO^ 

GGGGACX:TGGTTCX3CGGCCX3AGCACrTaK3GGTGACCCCCGATCTGCrr^ 

TCCCGCACaGGGCGGTGCTCCTTGACCXSAGQAGGTCGCGGACGCCGTQAACGGGGAGACGGG^ 

TATACCGGTCaCCCCACQGCQTGCGCCXSTCGCGCraSCCAATCTCGACATCATCmACXKK^ 

GGTGAAGGTGGGCGACCACCTraMCGGQCGGCTGGOSGCCCTGCGCGGGCrGCCCGC 

GCATGATGCTa3CCGITXSAGCrGGTGTCG(»aAGACQGCCCGC^ 

GCXSCTGCGCGAGGACGCGGGCGTO^TCBTCaSGGCCS^CGCCGCQCTCCCTGQTCCT^ 

GGCCACGGCGGAaSAGGTGGaSGACGGQCTXSGACTOMTQCTGCGGCGGC^^ 

CCCGGCGGGGGTGACGAGACCGCGGGCCGCCACCCQCGGGGGGCGCCGGGTCGGC3VCAaCGGCa»^CCCX3QC^ 

CGTTTCCCGGCGCCTTTTCCGTGCCCCGGCGCCGTTCCCGTGGCCCCTGCCCCTGCCCC^ 

GCrrGTGGCGCCGTTCCCGTTCCAGCGCGCTGTCGAGCOKCGCCAAGCGCCCaSTGCC^ 

CGGGGCGCGCGGAGCCCGGCAAGCCGAAGGGAAGTCCCGTCaSATGCGTGCCTCTTTCCCCAGAGGQTTCCGCM^ 

ACGGTCACX3CCGGGATCAGGGGGTCCCACGCGGACCTCGCCGTCyVTCGCCTCCGACGTTCCCGCGGCGOT^^ 

TTCACCCGTTCGCGGTTCGCCGCX3CCGAGTGTGC?rGCTCAGCCGGGACGCGGTCGCa3ACGGGATCGCCCX3GGGaSTGCT 

GOTGCTGTCCGGCAACGCCAACGCCGGGACGGGCCCGCGGGGGTACGAGGACGCCGCGGAGGTGCGCCATCTGGTGG^ 

QGATCX3TCX3ACTGa3ACGAGAGGGATGTGCrGATCGCCTCCACGGQACCCGTCGGCGAGCGGTATCCGATC^ 

CGGGCCO^TCnXSCGQGCGGTGCQCXSGGCCCTTACCGGGTGCCGACTTCGACGGCGCGGCGGCGGCCGTGCTG^^ 

GGGOSCCaSTCCCACGATCCXraCQGGOTOTGTGCGGCGACGCGACGCTGATCGGTGTCGCCAAGGGTC 

CGGCGGAGCAGGACGACCGGTCGACOCTGGCGTTCrrCTGCACGGACGCCCAGGTGAGCCCCG^ 

T^TCCGCCGGGTCGCGGACCGCGCCTTCCAOSGGCTGGGCTTCGQCGCCG^ 

CGCCAACGGGCTCGCGGGCCGGGTGGACCTCGTCGCGTTOSAACSMSGTCC^^ 

ACGTCGTCCGGGACAGCGGCTGCGGCGQOGCCCTGGTCa^TGCraGCmaCCGGGTC 

CGCGTGGGCCGGGCGGTGGTCGACGCGCCGTCGCTGAGGGC(a3CGGTGC3^GG^ 
CGCCGTGGCGGGTGGACACGGGGACGAAGGCCCCGGCCGGTCTa:CGGGCGGATav^^

TCTTCCCCGCCCCCOSCGACCGGGCCCXSCCCGGACGCCGTCACaKXSTATCCGCACGGCGG^^ 
GACCTCGGTGTCCCGGGCCGGGCGCCCGGCGCXSTTCACGGTCCACGGCrGCGACCTCCTGaCGGGGTA^

CGCaSGCCGGGCCGTCTGAAOSGGCGCTCCCGGGCGGACGGCGACCGaSAGGGCGCXKS^ 

GGCCCGGTGGTCGATCGGCCACCGGGCCCGCTCCCGTCQTTCCGTCCGCTGTCCCCGGCCGCCCTACCCCC^ 

^GCaAAaTCCACGGCGCTCTCGGaSTCCACCGCGTCCACaKGTTCTOKSCGT^^^ 
GGTGQCMGGGAaAOTCCACCGGTGCCGACGCGGGCGACGTGGTGGCGC^ 

S?S?S??SSa^^tccccgacgggtcgtacgccggggacacctcgacc^^^^ 
StcSSacca^cgagcagggtcagcacctcmcgcgagqacaot^ 

?ASSS5SSSSTCGACGGAaACGTACAGCGGCAGGCCGCCGACGG^^ 

SSSSGTGAGCOCCGGGTGAAGTCGGaSGCGGTGAOSATQ^ 

^C«CGGATTGTGGCCGCGGATGCCGACCTGQACCAGGCXK:TCXXK3GTC^^ 

GGGGTGCCGTGGTGGTAGGTGCCGCCGTAGACGGGTGGGTTGGTGTCGCTGTGCGCGTC 

^GTGGCGGGCGTGCACGGCGCGCAGGGCGGCCAGGGAQAQCSAGTGGTCCCCKC^ 

GTTCCAGGAGCCGGGTO^GGGCGACCGTCGCGGTGTCCATCQCXaVGGTCCATCGAGAAGGGGC^^ 

ccgtcgaccacgtcqatccggtcgaagacccctgggcccosgtcgatgccgacgccgtcga 

GATGGCGCGCGGCGOSAACCGCGCGCCGGGCCGGTAGCTGGTGCCTCCGTCGTACGGGGCGaXSACGAC^ 
QGC^TCGGGTCGGGCCGGTGGCGCAGCOK^ATGAAGOTCGCaSGTTGGGCGTAGCGCGGGGA^^ 

SSS5?ScSSScccSccSgctcccgttcccgtacc6acgcccg^ 
SS^SSSSSg^SSctScgttcccgcgtggaatcccgttcccgc^^^ 

TCS^SSS?SS^SSScGTTCCrGCGGCCGTTGCCGCTCrGCX3GGC^^^ 

S?SSSSSSS?^^Sgttgccgccgccggtgccgt^ 
GAGGGAGACCAGCXKX3TCCAGGACCGCGCGGTCCa\GTACGGGTGGOTGGTC

GGGACATCTCXSTTGAGGCCXSTCGAAGCCCGCCATGTCGCCCGaSATCTasra 

CGGTGCATACCGCCGAGCGGGATGTCGGCGCa3TACCCXK5TGAGGATGCGGAG(^ 

GGCGAOSAGCGGCAGCAGGTACTCCAGGACCGTGGGGTCGGTGATCT 

CGAGTTOSGCCGAGTGGAGCCGGATCTCGCTGTGCXSCGGTGCCC^ 

TOSGACACCTCXKSTGCCCATCXSACACGGACCGTGTCCCGGGTG^ 

GATGCCXXrOKSACAGGACGACGGTGGGGGCTOCCTCCCaSCCXK^ 

CGACCAGGTCaiCCGCCTCCCGTTCGCCGGGCAGax:CCXK3GAC3AG 

ATGTCGGAGCCGCCGACrCCGTGCAGCAGGAGGGCGGTCCCGGCGGGGACCCGGCAGAC^^ 

GTGGGTGCCGGACAGGCCCAGCGGCCGGCCCGGCTCX5TGCGCCAGGCT 

CGTCX3GCXK:GCAGCX:Aa^GaK3TACCGAACCX3Ga3a^ 

AGTGCGGCGAACCXSTCCGTTCAGGAGCCGGAAGGCCCCGGGGCCCCAGCGCaSCCa^^^ 

GCCGAGGGCGGCAGAGGAGCCGCCGAGCGCTCCGGTCAGCTCGGCGOSGTTGTACAGCTCGCCOT 

CCTGGCCGTCGGCX3ACaVGGACGGGCGGACX3GCCCAGGGTCAa3GCC^ 

TGCACGGGGACATGGGTCCCGCGGAOSGCGT^GCGGGGTGCGCTGCraOT 

GCGGCCCTCXKSTGCCGATGCGCACCCGGAATCCGTACACGAGGTCGGGGCaSGGCAT^ 

TCAGATGGCCAGGGCGGCGA/VACaSCCGGACTGGAAGTOSTAGGCCyVCCOT 

CGGCGCCCTTGGTGAGGGCGGCGAGCAGOSAGGTGCGGTCGGTGGCGaSGA 

TGGAa3AAGTCGACGCTTCCGAAGCCGACGGCGGGGGCGTGGGAGCGCTGGTGTCCX3^ 

GCCGTTGCGGTCGTTGTTGACXSACGACCATGAaSATCGGCAGGCCCAGGa^ 

AGTGGAAGCCGCCGTCX3CCCGCGATGAGGAAGACX3GGCTCX3CCX3GGCCGGGCGAT^ 

CaSTAGCCGAAGCTGGAGCAGCCaSCGGAGGTGAGQAATCOGTACGGCTGGTO^ 

GCGGAAGAAGCCGATGTCGCnXSACGAAGGTGCCGTTGTOSAGGACG^ 

TGCCX3TCCTC:X3TACT(X3GTGGGGTCX3Ga5AGGAATTCGGC^ 

GGGGCGAGGCCaSAGGTCGCGTaSTCGAGCGCGGTGACGAATTCGGOSAOT 

CAGCTCCGGGATCGGGTTGACCTCGGGGGCGACCCGGACCGTGGTCTTGGCCCHSGCCCCGCGTCCACATGC^^ 

GGTCCTCGGCGTAGTCGTAGCCGATCGCCAGGAGGAGGTCGGCGGGGCCGAAGATCTCGTCGAGGGCCC^ 

ATGCOSTCCATGTAGCCGCTGATGGCGCCXSTAGTTGAGCGGGTGGTCGTGCGGCAGGACGCCCTTGC^ 

GACGACGGGGATGTTCAGCCGCTCGGCGAGGGCGCGCAGGGCGTCGACGGCCCCGGCGCGGATGACGGCGCTACCGACC^ 

CGAGGAGGGGGTTCTCGGCCTCGCGCACCT^GCTCT^GCX^GCCTCGTCGAGGCGGGCGCGCCA 

GTGGCGGTGGCCCGGACCAGGGGGGCGTCGGTGGGGGTGCCGTTCAGCTCGGCGCCGAGGAGGTCGACCGGCAGGCT 

GAAGCTGGGACCCT^CGGGCTCGATCCGGCTGTTGAGGAaKjCGCTGTCGACGAGGTTGACGATGTCCTCGCC 

GCTGGACGCTGAACTTGGTCAGCGGGCCCATCTVCGGCGGTGCTGTCCAGGCACT^ 

TACGACTCGGACTGasaSGCCAGCGCGATGACCGAGCTGCGGTCCAGGGCGGAGGTGG 

CATGCCGGGGCCCAGGGTCGa3AAGCACGCCTGGGGGCGGTTGGTGATCCGGGCGAG^ 

TGAACTCGTGCCGGGTGAGGACGAAGTCQAGTCCTTCGACCTOSTCGAAGAGAATGGCXXSA 

CCGAATACaiTGGTCGACACCGTACTGGTGAAGACGTTCCAGCATGGCTTTCGCGGTC 

CGCATCGGACGGGOSCCGGGATGGCGCCCCGGAAAACGCXSGCACCGGGCGGTGCGCA 

GTGGCGTTGCCACTGTGCraGATaSCCTCTTGGaMCGGTCGGACGC^ 

GCATGGCGTCCATCGTCCTCGTGGCGCTTTTCGTGAAATCCGTCCGGCGCCGACGGTCTCO^TCCGATTCaSTCCCC^ 

CGTCCACCGATCCGAGGAGAATCCyvTGGATGTCCTGGCCGCGTTGGAGCGCT^GCCCAGCCTGAATCOT 

GAACCGGCTOTCGCCGCGCGCCAGTGCCGCGCTGGCCACCGACGCCGTCAACCGCTATCaSTACTCCGAGACCC^ 

CCGTCTACGGCGATGTCACGGGGCTGGCCGAGGTGTACGCGTACTGCGAGGACCTGGCCAAGCGCTTCTTCGGGGC^ 

CACGCCGGTGTGCAGTTCCTGTCCGGTCTGCACACCATGCAC^CCGTGCTGACTC 

CCTGGTCCTCGCGCCGGAGGACGGCGGCCACTACGCCACGGTGACmTCTGCCXKSG^ 

TACCTTCGACCGCCGGACACCTGGAGATCGACT 

SEQ ID NO: 17 cvm cluster 

GGTACCGGCATCCGACCCAGGCCCCGGGCGCAGGACCCGGAGGCAGGCACCGGCACACCC 
CGGCCGGGCGGCCCGGCTCCCGGCGGTCGGTGTCCGGCGACCCGCAATCGGCAGCCGCCC 
CAGGCCCGGGACAGGAGCCCGGCTCAAGGCACCGGCCCTGCGCACCCGCTGAGGCGGCAG 
GTTCCTGACAGCCGGCATCCGCCAGTCGGCGCGGGGCAGCCGCCCCAGGCX3CCCGGCCCG 
GCACACCCGTGCGAGCGCCCGGCTCCCGGCGGTCGGTGCCCCGGAGGCGGCGACCGGCAG 
CCGGACACGGCCCCGCTCGGGGCGCGGCCCAGGGCACAGGCCCTGGGCACCCGCTCGGAC 
GCCCGTTCGGACAGCAGGCCCGTGGGAAGCCGCCGGTCAGGCCCGCAGGCAGCCACCGGT 
CGGCGGGCGGATCAGGTGTTGGCGGGGGACTCGTCCGGGAAGATCTTTGTGACGAOGGfc 
CCXSTCCTCGGTCAGATAGCCGTGCAGCATCCCGGGGCroCTGTGCGGCGCGTCGAAGTCG 

CCCCGGGGGTCGAGGGCGATCACK3CCGCCCTGCCCGCCGAGCCGGGGCAGGCGCTTGACG 

ATCACCTCGTAAGCGGCGGACGCCACGCCGAGCCCCTTGAACTCGATCAGATGGGAGAGG 

GTCGAGGTCGCCGCGCCCCGGATGAACACCTCACCGGCGCCGGTGGCGCTCGCGGCGACG 

GTCCGGTTGTCGGCXSTAGGTCCCGGCCCOGATCAGCGGGGAGTCGCCGATCCGGCCGGGG 

AGCTTGTTGGTGAGCCCGCCGGTGGAGGTGGCCGCCGCGAGATCGCCGCGCCGGTCGAGG 

GCCACCGCGCCCACCGTCCCCGTCGACTGCGCGTCGGCCAGTGCCTCCGGGGCCCTCCGG 

GCGGCGGGATCGCCCGCCTCGGTCTCCTTCGCGCGCAGCAGCGCGTCCCAGaSGGCCTG 

GTCCAGTAGTAGTCCTGGGTGACGGTGCGCAGCCCGTGCCGGGCGCCGAAGTCGTCGGCG 

CCCTCGCCGGAGAGGAGGACGTGCTTCGACTTCTCCAGCACCAGCCGGGCGCCCTCGACC 

GGGTTGCGCAGGGAGGTGACCCCGGCGACCGCTCCCGCCTTCAGATCGGAGCCCCGCATC 
ACXy3AGGa3TCCAGCTCATGCCCGGa3TCGGCXKjTGAAGAC^
AACAGCGGGTTCTCCTCCAGTTCGCGGACGGCGGCCT 
CCGCGCGCGAGCACCCGCTGTCgjGCXSCGGAGCGCT^^ 
^CT(XXX5TTCCGGGCCGGTCX3TCTCCCGGTCa«3GG 
GCGATGACCACX5TCACGGGCGTCCGGCXX3GGGCITCCCCGGC^^
TTCTCCTCCaSCGCCTGCTOCTCCTGCTTCT^^ 
CCATGGCCGCCCGAGGCCCCGGGTACGACX3ATGAGCX5TGGTCGT^^ 
AGCAGGGAGGACGCGAGCCAGGCGGTGGCGGGGOSGTGGGGCATC^ 
CGGGGGTGAGAGACGCTCCGGCaaACTGTACTGACATGCCCATGCC^ 

GGAGCa3CCTTCCGCCgrCCCCGCCGCCa3GCGGCGCC(:XK:CC^

GCCAGGTCCTCCGGGGCX3GAGCGGGCGAGTCCGGCGAGTGTGC<:X3A(^ 
TCGTCCGCaSAOXSCGACGCCAGGCCCAGCCGGACCGCGTGCGGTGTACGGCCCTG^ 
GCGCAGAACX3CGGa3GCGGGCGTCACCCCGATCCCGTGCCXKX3CX3GCGG 
GTGTCGGCGOaCCAGGGGaKXXX^AGCACCCACCAGCA 

GACACGGCGAAGCCGTCGAGCGCGOSCCGGGCGATCrCCIG

CXX^TTGGCGCGTACCAGaSCGTCGACCGTGCCGTCGGTCTGCCaGCGm 
AGCGCGAACCXKX3aX3GGCa3AGACa3CCX3GAGCGa^a3CGGC^ 
AGCCCCGGGGGCACCaVCaSCGAACa^CAGGGTCAGCCaSGGGGCGAGCC^ 
CTGTCGACGAGCACCXSTCCGCCCGGGGGCXSACCGCCGCGAGCGGAGCC^ 

AGGAAGCCCCAGACGGCGTCCTCGACCGCGGGAAGGTCCAGCCX3CTCCAGGACCGCGGCG
AGCTGGGCGAGACGCCOSTCCGACaGGGTGAGGGAGAGaaGGTTGTGCAGGGTGGGCT 
ACATAGACay:CCGGAGCGGAGCGCTCCGGTTGGCCTCX5TCCAGCGCCTCCGGAATCACC 
CCGTCCXXX3TCCATGGCGAGGGGGACGAGCGTGATGCCX3AGCCGGGCC^^ 
ACCAOySGGTAGGTCAGCTCCrCGACCCCCyVGTCXKSCCCCCCGGCGG^ 

AGCACGGCGGAGAGTGCCTGCa3ACCGTTGCCCGCX3AACAGCACCCGCa3^^

CGCCAGCCGCCCCGGGCGAGCAGCCCGGCGGCGGCCTCGCGCGCOTCGGGGGTCCCGGC^ 
GCyVCCGGCCGGCaSGAGCACGGACTCCAGGACATCHSG^ 

GTGGCCAGCAGCGCGGCCTGCTCGGGGAOaACGGGGTGGTTCAGCrCCAGGTCGATC^ 
CTTCCXSGCGGGCTCGGAGAGOSCGGGGCCGACGCCCGCCCGCGCCXS^ 

CCGCGCCCCACCTCGCCGACGGTGAGCCCTCTGCGGGCCAGCTCCCGGTAGACCCGGGCG
GgySTGOAGTCGGOSATGCCGCACCCXSCGGGCGAACrCCCXK^TGC^^ 
CCGGGGCGCAGCCCGCCCGTCCTGATCTCCTCGGCGACCGCGTCGGCCACCTGCCGGTAG 
TCCTTCATCTCCCGTACCTCCCCTGTCCX3GTGGACCGCTTCCCGCCCGGCCCCX3CCGACC 
GTGAAAOSGAAGCACCCCGTTCCGGAGCTCGAGCTCCCCGTCCGGAAGCTCCCCGTCCGG 

AAGCTCCCCGTTCCAGAATTGCACCGAGAGCAATATTCCCTATTGCACCGATCT^^
CGATCTACGCTCGGAATTGCCTCACACAGACCGTCGACGCATCTGCCGCACACCXyST^ 
GACGCCCCGTCGGACCGCACCCGCGCGGAGCCX3TCGCCCCGCCCGCCCCGTTCGCGCACA 
GGAGAGAGAAGGAGATGGTGGAGACCAGCGCACTCGCCGGTGTGGTGATGGTCGCCCTCG 
GAATGGTCCTCACCCCGGGACCGAACATGATOTATCTCGTCTCCCGCAGCATCACCCAGG 

GCCGACGTGCGGGGATCATCTCGCTGGGCGGTGTGGCCCTCGGTTTTCTGGTCTATCT

TCGCCGCGAATCTCGGCCTGTCGGTGATCTTCGTCGCCGTGCCGGAGTTGTATGTCGCGG 
TCAAACTGGCCGGTGCGGCCTATCTGGCATATCTCGCCTGGAACGCCCTGCGGCCCGGTG 
GCGTGAATGTGTTCTCCCCCGAGGAGGTTCCGCACGACTCCCCGAGCAGGCTGTTCACCA 

tggggctgatgacgaactvtcctcaacccctagatcgccgtc^
ascagttcgtaaacccgaacgcggacastgtcctgttccaggggcrgattctto
tccagatcgck3gtgagc!gtcga3gtc^tctcgcgatcgtgct^ 

CCGCCTTTCTCGGCCGCC7VCCCCTTCTGGCTCAGGGTTCAGCGCax:GTGATGGGCG 
CGCTCXSGTACGCTCGaKSTCTCCCTGGCCCTCGACACCTCCGCCCCCGCCGCACCCGTCT 
CCTGAGGCCGCCGGACCGGGAGCOaACGCGAAGGCACCCCTOGGCaACCGTTa^ 
TTATCCGTTACCCC7VTGAATCCCGATATAAGTGCa.TTGGCCACrTACCCATGCATGGAAC
AGGCCAACCTGACCAAAAAATGAGCCCTCCCCACCCGGAATAGATGCTTCCCAGTO 
GAAATTTCATAGCGGGAGCGTCTGCCGAACAGGACGGCCCATACGCCGC^GGCAGAATO 
GA(^TCGCCGCCCGCCCGGGTCCAGAAAATTCGGAGGACACATCGGACGACCGTCTCCGC 
ATCGGCGTCJ^CTCCCGATTACAGAGAATATTGAGTACGTATCAACCGGGCCTTGATCT^ 
CTCAGCCrCCATTGTTCTCrCCAGTOSGGATGTGCAATGAAGTACGACATAACCCCACCA
TCCGGCCTTCGGTTCGACCTCCTCGGCCCGTTGACCGTGACCGCCGGCGAGCAACCCGTG 
GACCTGGGCGCGCCACGGCAGCGCGCCCTGCTCGCCCTGCnHSerCATCGATGTC^ 
GTGGTCCCGCTGCCGGTCATGACCGa3TCX3ATCTGGGGGGCCGACCCACCX3TC^ 
CGGGGGACGCTCCAGGCTTATGTGTCCCGACTGOSGAAACTCCTGCACaSCCAT^ 
TCCCTTCGCCTTGTCCACaVGCTCaUaGGGTATCTCCTCGAAGTGGATTCXj^

GACGCCGTGGTTTTCGAGACACGTGTCAGGGAGTGCCGGQAATTGAGCAGGGCCCG^ 
CCCGAGGCCACCCGGGCCGTGGCCTGGTCCGCCCTGGAGATGTGGAAGGGCACACCCATG 
GGCGAGCTGCATGATTATGAATTTGTGGCGGCGGAGGCCGACC(3GCTGGAAGGAATCCGG 
TTACGCGCGCTGGAGACCTGGTCCCAGGCXjTGTCTCGATCTCCAGCACTATGAAGAGGt't 
GCATTTCAGCTCGGCGAGGAGATCCACCGCAATCCGGAACTGGAACGGCTGGGCGGTCTC
TTCATGCGGGCCCAGTATCATTCCGGACGGTCGGCGGAAGCCCTGTTGACGTATGAACGT 
ATGCGTACCGCGGTGGCGGAGAATCTGGGGGCCGATATCAGTCCGGAGCrCCAGGAACTC 
CATGGARAGATTCTGOGCCa^GGAACTCAaSGAGACA 
CTCACACGGGCGGOjGGCCCXSCACGGGCCCCCKX^CCC^

GCACCCGa3GACATGGCa3AAACCa^CGGTGGCXy3AGGAj^ 

GCGGCGCCCGGGACCCa3CCC(X:CATGCajTCCCCCX3TACCXKr^ 

GCCGTCCCGCCGGTCa^CCCCXSGTGCCTCCCCCGGTCCCCCGCTCGGCCCTCCGTTCA 

GCACCCGCCGAGACCGAGGACCCGGAACCGGCGCCGCCCCCTCCCCCTCTO 

CGACTCATO^GCCGCCXSCGCCXSAACTGCGCAGGCnH^^ 

GCGGGCCACGGCCATGTCCTGCnty3TCTGCGGCG7UlCAGGGCATC^^ 

OTGGAGCACAC<:X3AGCACACCCTGGCa3CGGGa3CGTTCCGGGT^ 

GTCGCCACCCTCCOjGCACCGGGCTACnxSGCCCTGGGAGCACCrCGT^ 

CCGGACAGCGGCCrCGGTGACGACX3GCGACGCCX3ACCCCGTCGCCCAGGCC^ 

CCGGAACACCACCTCACCCACCAGATGCGGATCTGCCGGACGGTGCTCGCCGC^^ 

CGGACCCCGCTCCTGTTGATCCTGGAGGATCTGCa^CCTCGCCCAOGCGCC^ 

GTGCTCCAGCTCCnXaGTCJ^AACAGATCGGCCAGGCCCCCGTCATGGTCGT^ 

CGCGAGCACGATCTCGCCCGGGACCCCGCCGTCCGCCGGGCCGTGGGCCGCATCCTCCAG 

GCGGGCAACACCGGCACCCTCCGGCTGGACGGGCTCACCGAGGAGCAGAGCCGGGAGCTG 

ATCGTCTCGGTCGCGGGGGCCCCGTTCGCGCCCCATGACGCCCa^CXSGCTCCAGCGC^ 

TCGGGCGGCAACCCXSTTTCTOCTGCTCAGCATGGTCACAGGGGAGGACGGCA 

TGGGCACGGCCGTGCGTCCCGTTCGAGGTGCGCX3AGGTGCTGCACGAGCGGCnX3AGC^^ 

TGCTCCCCGTC<^CCCAGGACGTGCTCAaKrrCTGCGCCGTGCT 

qSACCGCroCTCACCGACATCATGTCCACGCTCGACATCCCGCACACTO 

GCGCTCGGCACGGGGCTGCTGCGCCACGACCGGAACACCGACGGAATGGTCCACTTCGCC 

CATGGGCTGACCCGGGACTTCCTGCTCGACGACACCCCGCCGGTavCCCGCGCCaSCTC^ 

CACCACCGGGTCGCCGCGACCCTCGCCCTGCGCraCCAGCAGGG^ 

ATCCGCCGCC7^CTGTCTGGCaK:GGCCa3TerGCra3GCGCCa3TO 

CTGCTGGCGCTGGCCGACCGGGAGCAGTCCCGCTTCTCCCACGCXSGAGGaKrr^ 

CTGGAGAGCGCGGTCGCX3GTCGTCGCGGCGCTGCCCCGGGACCAGCCGGTGTCCGCCGTC 

GAACTCGAGTTGCGCAAAOSGATGATGGCGCTGCACGCXSCTGATGGACGGCTATC^ 

GCCCGCGTCGAGACGTTCCTCrCCCAGGTCACCX:AGTGGGAACACGTCTTCGACAA(^ 

CAGCCCACCGGGCTGCTG(^CGTCCAGGCGCTGAGCGCX3CTCACCACGGGCCGCCAT^ 

CAGGCGGCGGAGCTGGCCGGGCTGCTGCACGAGCTGGCCmCCAaSGCGGCGGACCGGAG 

GCCCGGTCXKX:GGCCTGCTATGTGGACGGCGTCACCCTGTATGTGGGCGGAa3(3GTCGAC 

GAAGCCCTCGCCGCGCTCGCCCAGGGCACCGAGATCACGGACGCCCTCCTGGCCGGACAC 

CGCAGGACCGCCGCCCCGCACGGCGGCGGGCACCTCCAGGACCGGCGTATCGACTTCCGC 

GCOTATCTGGCXX:TCGGCCACTGTCTCAGCGGa3ACCGGATTCAGACCCAGCGCT^ 

ACGGAACTCCTCCACCTCACCCAGTCGGAACGGTACGACCGGCCGTGGGACCGGGCCTTC 

GCCCGCTATGTGGACGCX5CTCATCGCCGTCACGGAGTGCGATGTCCAGGGGGTGTGGCTG 

GCCGCGCGGGCGGGGCTCX3ACCTCGCCGCCCGCTGCCAGCTCCCGTTCTGGCAGCGGATG 

CTCGCCGTCCCCCTCGGCTGGGCCGAGGTCCACCAGGGGGCGCACGACAAGGGGCTGGCC 

CGGATGCGGGAGGCGCTGCACGAGGCGGCCCGGCACCGGACCCTQCTGCGCCGTACGCTC 

Ca^CCTCGGCCTGCTCGCCGACGCCCTCCAGTACACGGGCGCCCGGGAACAGGCCCGGCGC 

ACGATGTCCTCCGCCGTACGGGAGATCGAGCGCCGCGGCGAGTACTTCTGTCTCCGGCCG 

CAGTGGCCCTGGGCCCGGCTCCTCCACAGCCACGGCACCTCCGCCGCGGCGGAGCACCGG 

GTCGTCCACGGCAGGCACTGACCCGGGGCCGGCCGGAGCCGGGCCCGTACGGTACGGGTC 

CGGCTCCGGACCCGGCGGCCCGGAGCCGGGOSGGGCGGGGCGGCCCGACGGTTCCGGGGC 

CGGCGGTTGTGGGAGGGGGCGGCCCCCGATCGCTCAGACCGGGCAGACGGCGGACCGCCG 

CCCCGCCCGGCCCGAGCCGCCGCCCCCGGCCCAGTGCCCGTAGTCGCCCCGCAGGAAGAC 

CAGGGGCGAACCCTCGCGGATCACCCCGAGGTCX3CGCACCGCCCCGGTGACGAACCAGTG 

GTCGCCCGCCTCCGTCTCCCCCGCCACCTCGCAGTCGAACCACGCGAGCGCGTCGAGCAG 

GACGGGGGAGCCGGTGGCCGTCGTCCGGTACGGCACCTCCCAGCGCCCCGGATCGCCCCC 

GGCGAAACTCCGGCaiLGACCGGGCCCTGATCCGaSCCGAGCACATTGACGCAGAAACGCCC 

GGCCGCCCGGAGCCGOGGCCAGGTCGTCGACGACCTGGCCGGGAGGAAACCCACCAGCAC 

CGGATCGMCGACACCGMGTCAA^ 

AGCCTCGGCaSGACCGGTGACCy^GGACCACCCCXMTGGGATAGTGGarCGCCACCCG^^ 

CyVGOSlGACTCCCGGACACGGACCCGTGGGTGTGCGCGGAAAGGCCCXSGAGGCCGGGT^^ 

AGCCACGGGTAACGCGCGGTGTCCTTGCCCGCX3TAATCGGGGTCCAGATAGACGAAGGCC 

CGGTGGACGAGGAAGTCCCGCACCTCGTAGACCGTGCACCAGCGCCCX3Ga3GCCCACTCG 

GGGTCACCCGCCCGCCACGGCCCGTCCCGGTGCTCACCGTGGGTGGTGCCCTCCGCGQCG 

AGGAGTTCGGTCCCGGTCAGAATCCAGTTGACGGACCACAGATGGTGGGTGATCGAGCGG 

ATGGTGCCCCCGAGGTCGTCGAAGAGCCGGGCGATCTCGGACTTGCCCCGGGCCAGACCC 

CACTTGGGGAAGAAGAAGACCGCGTCCTCGGCX3AAGTAGTCGATCGCGGGGGTGCCGTCG 

CTGCCGACGCCGCCGTTGTCGAACGCCTTGAAGTACGCGGTGATGACCGCCTTGCGCTGC 

TCJGTCCGTCATACCGGCCGATGCCACGGACATGAAACGACCTCCAGAGATTCCGGGTGGC 

TGTGCTGGGGCTGCGGAAGGGGTGTCCCCCGCGAAGGACGGCGGACGCCGCGGACGCCGC 

GGCCGTCTCCCOSGCGGACGGGTCCCAGCGTCCTGGAGAGGGCTTGGCGGCGGC^^ 

CaSTGCTGTCCCQCGGCTTGCGGAACGCGAAGTACCGGCCAGaSTAa^ 

GACGTGTACGCCG6TCGGGACCCCrCGTACCCCCGGAGCaXX:a3ACCCa3GC^ 

GGGGTACGGACGCGCCGGACCGGCCCGAGaSAGCaMSACGGGTCGGACGGTGCGCGTGGT 

TCCGGTGTGTCGGACAGCTCGGACGGACCGGACGGTGCX3CGTQGTTCCGGTGTGTCGGAC 
AGCTCGGAa3GGTCXK3ACX3GTGCGCX3TGGTTCCGGCACGCC^^

CATGGCGAGCAATGCCGGGGTGTACCGCTCCCCGGACACCGGGTGC^ 

CACCTCXXKX3A6GGACCGGTC6TCCA.GCCX»3AT^^ 

ATCGGCa3GOTTCGCXK3TGCCCGGGATCGGGACX3^ 

GGCGA6CGCGA6CTGTGCCAGG6TCA6CCX;CAGACC^ 

CAGCAACGAGCGGTTGCGCGCGAGGGCCGGAGCGC^^ 

GTCCraSTTCCCCAGATCGTCXBGTGGTGCGGATGGTGCCGGTGAGAAAACCC 

AGGGGCGTAAGCmaSATCCCGATCCCCAGCrCCCGGCAGACXSGGCACC^ 

GATCCaXXX^GACGACAGGCTCCACTOSCTCTGCACaS^ 

CGCCCGGCXK:AGCGTGGCCXXX3GAGGGCTCGGAGAGACCGAC^ 

GCXSCACCAGCrCGGCCACCGCACCCAOSGTCrCCTCGATCGGCA 

GTGCnXKSTAGTACAGGTCXSATGCGGTCXSGTGCCGAGAaSAC^^ 

aSCGCGGAOSTAGGACGGCTCGCCGCACAAGCCCTGGGAGGCXSCX^ 

CyVTGCCGAACTTGGTGGCGATCy^GCACCTCGTCCCXSGCGGCCaSCGA^ 

CAGCTCCTCACCGGCGCCGT^SCCCCTGGACGTCGGOSGTGTCCaGC^^ 

GTCXSACGGCGGCXSCGGATGGTGGCOGTCXSCCCGGGCGOSGTCC^^ 

GGTGGTCGGCAGGCAGCCX3A6CCCCnt3GGCACTGACCGGAAGGTCCXX^ 

CGGCGGACGaK3AACaKX3G03GACACGGAACa3^ 

CATACGGAACCraZACAGGOSGAGCCGGGAACGGGACGAGGGOGAGGA 

AAGGAGAGGACGGGAOK^ACAGCACGGACGGGAaSGACGGAACGGAGTCGC^^ 

GGGGTGACCGGAACOSGGCaSTCCTTGGCCCTCCCCCGTCCTCCCCGC^ 

TCCCC03TTCCCTCTCCCGTCCTCCAGCCAACACCGCCGCCCTTTCCAAGa3CT^^ 

GGCACaSACAGCCGCCGCCGGGCGCCaSATGGGGACCOSTGCCCGCCGGTGAGC^ 

GAGCGCCGGTACGGGACCCCACGCGCCGCCGCCCXSGGCGCCCGCCAGGGCCC^^ 

ACCCCGGCCCGCCCCGGCCGGAGCGGCGATCCGGGCCGCTCGCTGCAAGAGGAACATCCA 

OVGCCGCTVCAAGGAGCGCTCCGCACyVGTGGGCa^CCACGTCCGCCCOSTCCC^ 

GGCCGGTCCCCACOKSACTVGCACAGCACCGCACAGCACCACATaSC^ 

CACCACCGGCACGAGGAACCAAGGAAAGGAACCACACCACCATGACCrCAGTGGACT 

CCXKX3TACGGCCCCGAGCTGCGCGCX3CTCGCCGCCCGGCTOCCCCX3GATO 

ACCTGTACGCCTTCCTGGACGCCGCGCAaia^CGCCGCCTC^^ 

COVCCGCGCTGGACACCTTCAACGCCGAGGGCAGCGAGGACGGCCATC^^ 

GCCTCCCX3GTGGAGGCCGACGCCGACCTCCCa\CCACCCCGAGC»^GCACCCCGGCGCCCG 

AGGACCGCTCCCTGCTGACCATGGAGGCCATGCTCGGACTGGTGGGCCX3CCGGCT 

TGCACACGGGGTACCGGGAGCTGraCTCGGGCAOKSTCTACCACGACGTGTACCC^ 

CCGGCGCGCACCACCTGTCCTCGGAGACCTCCGAGACGCTGCTGGAGTTCCACACGGAGA 

TGGCCTACCACCGGCTCa^GCCGAACTACGTCATGCTGGCCTGCTCCCGGGCCGACC^ 

AGCGCACGGCGGCCACACTCGTCGCCTCGGTCCGCAAGGCGCTGCCCCTGCTGGACGAGA 

GGACCCGGGCCCGGCTCCTCGACCGGAGGATGCCCTGCTGCGTGGATGTGGCC^ 

GCGGGGTGGACGACCCGGGCGCCATCGCCCAGGTCAAACCGCTCTACGGGGACGCGGACG 

ATCCCTTCCTCGGGTACGACCGCGAGCTGCTGGCGCCGGAGGACCCCGC^^ 

CCGTCGCaSCCCTGTCa^GGCGCTCGACGAGGTCAaXSAGGOSGTGTATCTGGA 

GCGATCTGCTGATCGTCGACAACTTCCG(^CCACX3CS^CGCGCGGACGCCGTTCT 

GCTGGGACX3GGAAGGACax:TGGCTGCACaKX3TCTACATCa^ 

AGCTCTCCGGCGGaSAGCGOSCGGGCGACGTCGTCGCCTTCaVCACC^ 

CCGGGTCCXSACACCGCGCGGCTGAACCCAOSGTCCGGGGCCCMGGTCa^^ 

GCTGAGCCCCCGGGTCCGGCAGCX3GQCGGCTGAACCCCCGCCCX:GGGCCACC^ 

GCCCCCGCGCACOSGACraCGCCCGCCroTACGGaSGTCCaKrTC 

agcgcccggcggaccgccgccccgccgggggacggacagagccgggtgcgggaggacgtc 
ctcccgcacccggctcccaccgttccgcaccgaccxx».ccc:x3accgtgccgcaggcgc<^ 
ccggcaccgo^ccgcccgcgccggcagccaccacaggcgccaasccgcccgcacggtc^ 

CGCGCTGCTCAGCCCCCGTCCACCGGGCTGTCCAGCyVGCCGCCGCAGCGCGCCCCCGATG 

AACTCCCGGTCGGCGGCCGACCCCCCGGACCCCGCGAGATGCCCCCACACTCCCGGGATC 

ACCTCCy^GCX3AGGCATACGGCAGCAGATCGGCCACCCGCTTCTCGTCCTCGAa3GCGAA^ 

CACACGTCCy^GGGCGCCCGGCAGCACCAaSGCCCGCGCCGTGACGGAGGCCAGCGCCGC^ 

TCGACGCTCCCCCCGGCCCCGGGTGTCGCCCCCACATCCX5TGTTCTCCCAGGTGCGCACC 

ATGGTGAGaiGATCCGCGGCX3Ca3GGCCCXK3AQAGGAAGACCTGCTCCCa^^ 

AGGTACTCCTOKX^TGGCQAAACCCaiGCTCCCGGTGGGavaSGa^ 

OSCGAGQTCCCCa^CCCGGCGAACyiCCCXKK^CaaCaSCCTTCaSCCCCCGCTC^ 

TCGGCGCTGAGOTCCGOSGCaiGACCXSGACAGCaWKSACCy^GGCT 

GGCGCCCCGCAGATCGGGGCGATCCGGCGCACCATCCCCGGATGCGACACGGCCCACTGG 

TAGGCGTGGGCCGCGCCCATCX3ACCAGCCCGTGACCAGGGCCAGTTCCCGTACCCCCAGC 

TCCTCGGTGAGCAGCCGGTGCTGCGCCGCGACATTGTCCTGCGGAGTGATCAGCGGAAAG 

CGGGACCCCX3ACGGGTGGTTGCCGGGCGAGCTGGAGACCCCGTTGCCGAAGAGTCCGGCG 

GTGACGACGCAGTACCGCOSGGTGTCCAGCGGCAGCCCCGCACCGATCAGCCAGTra 

CCGGTGTGGTCCCGGCOSAAGAACGACGGACAGAGCACCACGTTCGTCCCGTCGGCGTTC 

GGCGTGCCGTACATGGCGTAACCGATCCGGGCGTCCCGCAGGACCTCCCCGTCCA6CAAC 

GGCAGTTOSTCGATCTCGAATATGCGGCATTCCACCGCTGACCTCCTTGTTCGATCCCCC 

CGGACAACAGGTCGGTCGTQ GCCGGAGACTCAGAGCCAGTTGGGGGCGATCTCGGTGGCC 
a^CAGCTCaVGGCTGCXSCAGCTGGACATCGTGCGGGAT^
AGO^GATACrCCGGATCGTGCCGCTCCACCAGCTTCTCGATCATGC^ 
GGGi^CCGACCCACTCCAGCCCCCGGTCGACCAGGGTC^^ 
CCaSTCTCGCCGGTOXrGCGCAGCGCCTaSGTGAAGCCC^^ 
AAGATGAAGCCGCaSCCGCGGGACGCCCAfiTGGTGGGCCTO;^

AGGACGTCCTTCATC7^CCCCGACCCGCTCGCCCax:CXX:AGGGTGCC^ 
TOSGCCTCCrCCCGGTAGATGTCCATCAGCCGGGaSAaSATCTGGTaSTCGC^ 
AGGATCGGCACCACGCCCTCCCGGGCACAGAACaXSAACGTGTCCTC^ 
GGCTGGAAGAOSGGCGGGTGGGGGCXKri^TAGGGCTTGGGCGCG^ 
ATGACGCCGTTCTCGTCGAGGCCCCGGCCGTAGCXKSaSCACCGCCTaSTAGGGGA^

AGGTCCGGCACCGGGATCXSTCaVCTGCTCCCCGGAGTGGGTGTWVCGTCTOKSTCGTC^ 
GCCTTCTTGATGATCTCCCAGTGCTCCTCGAAGAGGGCACGATO 
CCGGOSTCGGACAGGGTGCCGCCGACCCCGTACACCTGCCCCa^TGATGTCGGCCCa^ 
TTCIXK3AACCCXXXXXK:GATCCCX5ACGAAGGCX3C^^ 
GCCAGATCCTCGGCCaVGCCGCAGCGGATTGTGCAGCGGCAGGAaSTTGGCCATCT^^

ACCCGGATGTGCCGGGTCTGCATGCCGAGGTAGAGCCCCAGCATGATCGGGTTGTTGGM 
ACCTCGAAACCCTOSGTGTGGAAGTGGTGCTCGGTGAAGGACAGTCCCaVGTAGCC^^ 
TCGTCXSGCOSCCTGCGCCTGCCGGGTGAGCTGCCGGAGCATQTTCTGGTAGTTC 
TTGACCCCCGCCATACCCCGCTGGACCTGCGCATGACTGCCGACCGTTGGCAGATAGAAG 
AGAATGGACrrCACCCTGGCTCCrCCGGTTCGCGGCGCCCTCCATTGACGTGC^

GOSGCTCGACCGTCCCACTCCGCCCTTGAGTTCCXSTCTGACGCCGCGCCAGTCGGCGGG^ 
CGTCCGCCGGGGTGCCCGCCGGGGTCCGCACCCGCCGGACGGCACGGCGCGCACCGCGCG 
CGCGGCGCTTCGGGGCACCGGGCTCGACGGGGTGCTCAGCGGGACGTCCAACX3GAAGGCA 
AGCCCCCGTACCCaiGCCrGGTCy^GGCGCTCATaSCCATTCCCTGAGGAGGTCCC^ 
GACCACAGCAATCTCCGCGCrCCCGACCGTGCCCGGCTCCGGACTCGAAGCy^C^^

TGCCACCCTCATCCACCCCACCCTCTCCGGAAACACCGCGGAACGGATCGTGCTGACCTC 
GGGGTCCGGCAGCaSGGTCCGCGAaVCCGAOSGCCGGGAGTACCTGGACGCGAGCGCCGT 
CCTCGGGGTGACCCAGGTGGGCCACXSGCCXSGGCaSAGCTGGCCC^ 
GATGGCCCGGCTGGAGTACTTCCAC^CCTGGGGGACX3ATaVGCAACaACCGGG^^ 
GCTGGCGGCACGGCTGGTGGGGCTGAGCCCGGAGCCGCTGACCCGCXSTCTACrTCACCAG
CGGCGGGGCCGAGGGCAACGAGATCGCCCTGCGGATGGCCOXSCTCTACCACCACaSG 
CGGGGAGTCCGCCCGTACCTGGATACTCTCCCGCCGGTCGGCCTACCACGGOGTCGGATA 
CGGCAGCGGCGGCGTCACCGGCTTCCCCGCCTACCACCAGGGCTTCGGCCCCTCCCTCCC 
GGACGTCGACTTCCTGACCCCGCCGCAGCCCTACCGCCGGGAGCTGTTCGCCGGTTCCX3A 
CGTCACCGACTTCTGCCTCGCCGAACTGCGCGAGACCATCGACCGGATCGGCCCGGAGCG
fSaTrnrafirGATGATCGGCGAGCCGATC T^TGGGCGCGGTCGGCGCCGCGGCCCCGCCCGC 
CGACTACTGGCCCOSGGTCGCCGAGCTGCTGCACTCCTACGGCATCCT GCTGATCTCCGA 
CGAGGTGATCACGGGGTACGGGCGCACCGGGCACTGGTTCGCCGCCGACC ACTTCGGTO^ 
GGTCCCGGACATCATGGTCACCGCCAAGGGCATCACCTCGGGGTATGT GCCG 
C GTCCTGACCACCGAGGCCGTCGCCGACGAGGTCGTCX3GCGACCA GGGCTTCCCGGCGGG
CTTCACCTACAGCXSGCCATGCCACGGCCTGCGCGGTGGCCCn^GCCT^ 
CGAGCGCGAGAATCTGCTCGACAACGCCAGCACCGTOSGCfeCCTACC^ ^ 
GGCCGAGCTGAGCGATCTGCCGATCGTCGGGGACGTCCGGCAGACOKSTC I^ 
TG TCGAACTGGTCGCCGACCGCGGAACCCGGGAGCCX3CTGCCGGGCGCCGCCX5T CGCCGA 
GGCCCTGCGCGAGCGGGCGGGCATCCTGCTGCGCGCCAACGGCAACGCC CTCATC^^

CCC CCCGCroATCTTCACCCAGGAAGACGCCGACGAACTCGTGGCGGGCCroCGCTCTO^ 
ACTCGCCCGCACCAGGCCGGACGGCCGGGTGCTCTGACCCCTT^ 
ACCGGGGCACCACCCCGCCGCACCCCGAGCGCAAAAAGACCCCTCTGCCTG ^^ 
^TCAGAGGGGTCTGGTGCAGTGGAGCCTAGGGGAGTCGAACCCCTGA CATCTGCCATG 
CAAAGACAGCGCTCTACCAACTGAGCTAAGGCCCCGAAGCGACAGAACGGCCCTGGACTO
C TCCGTCCCGGCCACTGCCGCAGACCAGAGTACCGGGTGTTCCCGGTGA TCCTCCAAAAC 
ATTGAGGTCTCCCGGTGGGCGACCACTCTCCGTAAGATGCTCGACGTGG TTCGC^^ 
GAAG CCCGCTTGGGGAAGCGATGGGGAGACGOXATGGACGCCGCTC AGCAGGAGACGAC 
rnrAAGAG CCCGGGAGCTACAGCGAAGCTGGTACGGGGAGCCC CTGGGGGCCCTGTTCCG 
(^GGC TGATAGACGATCTGGGGCTGAACCAGGCGCGTCTCSGC GGCGGTGCTGTO
CGC CCCCATGCTCTCCCAGCTCATGAGCGGCaiGCGGGCCAAGATCGGCAA 
GGTCCAACGGGTCCAGGCGCTCCAGGAGTTGGCCGGACAGGTGGCCGAC GGCAGCGTCAG 
CGC GGTGGAGGCCACCGACCGCATGGAGGAGATCAAGAAGTCGCAGGGA GGCTCCGTCCT 
G ACCGCGAACAGCCAGACCACCAACAGCTCGGGGGCGCCGACC GTCCX3CCGGGT 
GQAGATCCA GTCGCTGCTGCGGTCCGTGTCCGCCGCGGGGGACATCATC GACGCGGCGAA
CT CCCTCGCCCCGACCCATCCGGAGCTGGCAGAGTTCCTGCGGG TGTACGGGGCCGGGCG 
CAC CGCGGACGCCGTGGCGCACTACGAGTCCCACCAGAGCTGACGACCGAGGCCGGCCCC 
flG^^C GGACCAGAGCCTCATGAGGGACGGGGAGCGGACGOSGC T^CCATGGGTGAGG^ 
CG CCGGCCGGTACGAGCTGGTCGACCCGATCGGACGCGGAGGG GTCGGCGCGGTCTGGCG 
CGC CTGGGACCACCGGCGCCGCCGCTATGTGGCGGCCAAGGTGCTCCAG CAGAGCGACGC
Gt^CACC CTGCTGCGCTTCGTCCGCGAGCAGGCCgrGOSGATCGAC^^ 
GGC CCCGGOSAGCTGGGCCGCGGACGACGACAAAGTCCTCTTCA CCATGGATCTTO^ 
CGGCGGATCACTCGCGCACGTGATCGGCGACTACGGCCCGCTCCCGCCGCGCTATGTGTG 
CGCCCTGCTGGACCAACrCCTCrCCGGGCrCGCaK^

CCGCGACATCAAACCGGCGAACATCCTGATGGAGGCCACCGGGACGGGC 

GaSCCrroTCCGACTTOyKATCTCCATGCXXaAG^^ 

CTATGTCGTGGGTACXBCCCGGTTACTTTCCCCCCXSAGCAGGTC^ 

CTTCCCCGCCGATCTCTTCGCCGTaSGCCT^ 

ACCCXSACACCAAGGCCCTGGTGGACTTTCTTCACCGCCCAT^ 

GGGGATACa3GAGCCGCTGTGGCAGGTGCTCGCGGGG(^^ 

CCGGTTCa3TACGGCGACGGGGGCCCGGAAGGCCCTCGC<XK:CGCCGTGGAACT^^ 
CGAGAGCGGCCCCGACGACX3AACa3GTGGAGATATTCGACCAAC3X^^ 
GGGGTTCGGCCCCGGa3GCCCa3AGAACAa3CCGC<X!TCCGGTCT^^ 
CTCCGGTACC 

SEQ ID NO:18 orf2par reverse complement

ATGGCCACCACXSACCGCGAT^GCCATGCTGGAACGTCITCACCAG^^ 
GTCCGCCATTCTCTTOSACGAGGTCCSAAGGACTaSACTTCGTCCT^ 
TCGCCCXXSATCACCAACCGCCCCCTVGGCGTGCraCGC^ 
GCCCTGGACCGCyiGCTa^Tt^TCGCGCTGGCCGCGCAG 

CAGCACCGCCGTGATGGGCCCGCTGACCAAGTTCAGCGTCCAGCTOSAACGCGGC^ 

TCCTCy^CS^GCCGGATOSAGCCCGTGGGTCCCAGCTTCATCAGCCT^ 

ACCGACX3CCCCCCTGGTCCXKK3CCACaK:CACCCACXK:CCTGGAa3CCGACT 

GaKXSAGGCCGAGAACCCCCTCCTCGTCGTCGGTAGCGCOSTCATCCG^ 

GGCTGAAO^TCCCCGTCGTCACCACCTACy^CCGCCAAGGGCGTCCTGCCGCACGACCAC^ 

TACATGGACXSGCATTCTCGGCCACCCGGCCCTCGACGAGATCTTCGGCC^ 

CGAGGACCIXX:GCCCCrCCATGTGGACX3CGGGGCCGGGa::A^ 

TGTTCCGCGCCGACa^TCGACATCXSTaiCCT^CGTCGCCGAATTCGTCACCraaSCT 

ACCCGGCACGACCTCAGCGCCCTGCGCGCCCGCGTCGCCX3AATTCCTCX3CC^ 

CCAGGTGATCGACTGCATGAACTCCGTCCTCGACAACGGCACCrraaSTCAGCGACATa^ 

TCGCCAAGTCCGACCAGCCGTACXSGATTCCTCACCTCCGCGGGCI^CrCCAC^ 

CAGATCGCCCGGCCCGGCX3AGCCCGTCTTCOTC7^TCGCGGGCGAa3GCGGCTTCCACTCC^^ 

GCKSCCTGGGCCTGCCGATCGTCATGGTCGTaSTCAACAACGACCGCAACGGCCTG^ 

GCTCCCACGCCCCCGCCGTCGGCTTCGGAAGCGTCGACTTOSTCCAGCrCGCCGAGGCC^ 

GACCGCACCTCGCTGCTCGCCGCCCTCACCAAGGGCGCCGGACTCXSGCCGCCCGTTCCI^ 

CCAGTCOSGCGGTiyrCGCCGCCCTGGCCATCrGA 

SEQ ID NO:19 orf3par reverse complement

ATGCCCGGCCCCGACCTCGTGTACGGATTCCGGGTGCGCATCGGCACaSAGGGCCGCCCCGGCGGOSGC^ 
ACCCXKKilAGCGCACCCCGCTTaSCCGTCCGCGGGACCCATGTCCCCGTGCACGACG^ 

CCGTGACCCTGGGCCGTCCGCCCGTCCrGGTCGCCGACGGCCAGGTCCGGCTGCTCCTGGCGGGCGAGCTGTA<^^ 
CTGACCGGAGCGCTCGGCGGCTCCrCTGCCGCCCTCGGCGACGCCGAACTGCTGOT 

CTTCCGGCTCCTGAACGGACGGTTCGCCGCACTGCTCACCGACGCCTCCACCGGCGCGACCGTCGCGGCCACCGAC 

CGGTACCGCTGTGGCrGCGCGCa3ACGTGACGGGGCTGAGCGCCGCCACCGAGGCGAAGACCCT(^ 

CTGGGCCTGTCCGGCACCaVCACCCGCaSGGGGCGGCGGGCGTCTGCCGGGTCCCCXSCCGGGAC^ 

GGaSGCTCCGAO^TCACaKrCAGGGCGGTCCGCACCTGGACACCCCCGCT 

CCTG6Ta3Ga3AACX3CCTCXK:a^CGGCGGTCaX:ACCCGGCTGCGCGGCGGGGAG^ 

TaSACTCCGGGGGAGTOSCCGCCCa^CAaSGCGGCCCTGGCACCCGGGAC^^ 

TTCGACGCGGCCCGCTCGGTCGCCGTCCACCTGGGCACCXSaSC^ 

GCCCroGGCGGTCGCCGCCGCGGAGATGACCGACCCCACGGTCCTGGAGTACC^^ 

ACyVCCGGGCCGCTCCGCATCCTCa^CCGGGTACGGa3CCX3ACATCCCGCTCX3GCGGTAT^ 

CrrCGACGACGAGATCGCGGGCGACATGGCGGGCTTCGACGGCCTCAAaSAQATC 

GACCACCCACCCGTACTGGGACCGCGCGGTCCTGGACGOSCTGGTCTCCCrCGAACCCGGGCTCA^ 

AGTGGGTGTTGCGGCAGGCCCTCTCCGGCCTGCTGCCOSCCGAGACCGTGGCCCGCCCCAAGCTG 

ACCACCAGCGCGTGGACCGGACTGCTCCrCX3CCGAAGGGATCCGGCGCGACGAGGTGAC(^ 

CCTGTACGACGCGGTGGTCT^TCGACaVCGGTGCCGCaSGAGGACGTGGACTTCGG^^ 

QCAGGCTCAGGCTCCAGGGCCGGGTGGTCGTATGA 

SEQ ID NO:20 orf4par reverse complement 

GTGTCCACOKICGTCTCCCCGCGCTAaSCCCAACCGGCmCCTTCATGCGGCTGC^ 
GGTGGTCGTCGGCGCCCCGTACGACXSGAGGCACa^GCTACCXSGCCCGGCXSaS 

GCerGATCCACGGCGTOBGCATCGACCGGGGCCCAGGGGTCTTCGACCGGATaSACGTGGTCGACGGGGGC^ 
CCCTTCTCGATGGACCTGGCGATGGACACCGCGACraGTCGCCCTGACCCGGCrCCTGGAA 

CGGGGACCACTCGCTCTCCCTGGCCGCCCTGCGCGCCGTGCACGCCCGCCACGGCCGGGTCGCCGTCCTGCACCTGG^ 

GCGACACCAACCCACCCGTCTACGGCGGCACCTACCACCACXSGCACCCCCTTCCGCTGGGCCATCGAAGAG^ 

GAGCGCCTGGTCCAGGTCGGCATCCGCGGCCACAATCCGCGGCCCGACTCCCTGGACTACGCGCGCGGGCAC^^ 

CACCGCCGCCGACTTCACCCGGCGCTCACCGCGCGGCATCGCCGAGCAGATCCGGCGCACCGTCGGCGGCCT 

CCGTCGAGATCGACGTCGTCGACCCGGOSTACGCCCCGGGCACCGGaVCACCGGCCCCCGGCG^ 

ACCCTGCTCGACGTGGTCGGGCAGCTCAGGCCCGTCGGCTTaSACGTGGTCGAGGTC^ 

CTCCCTGCTGGCGGaSGAGATCGGGGCCGAACroCTCTACCAQTACG^ 
CTCX:CCTGCCaiCCGGGGGCGGC»GCGGAOGAaK:c:^^
TTCGCCGGGCAGCGGTGGGGGTAG 
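
As an illustrative aside, the relationship between a nucleotide listing and its polypeptide listing can be sketched in Python. The helper names `reverse_complement` and `translate` are hypothetical, the standard genetic code is assumed, and only the first 33 legible bases of SEQ ID NO:19 are used. Translating them gives the first eleven residues of the SEQ ID NO:25 polypeptide.

```python
# Illustrative sketch only: helper names are hypothetical and the
# standard genetic code is an assumption, not stated in the listing.
from itertools import product

_BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order ('*' marks a stop codon).
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 33 legible bases of the SEQ ID NO:19 listing.
prefix = "ATGCCCGGCCCCGACCTCGTGTACGGATTCCGG"
print(translate(prefix))  # MPGPDLVYGFR
```

The SEQ ID NO:20 listing begins with GTG, which this table reads as valine, matching the first residue shown for SEQ ID NO:24.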

SEQ ID NO:21 cvm6 Polypeptide

VPGSGLEAIiDRATLIHPTLSGNTAERIVLTSGSGSRVRim)GREYLDASAVLGV^ 

RAVELAAMjVGLSPEPLTRWFTSGGMGNEIALRMARLYHHRRGESARTO 

FIiTPPQPYI«EIjFAGSDVTDFa^IiRETIDRIGPERIAAMIGEPIMGAVGAAAPPMYWPR^ 

HWFAADHFGWPDIMVTAKGITSGYVPHGAVIiTTEAVADEWGDQGFPAGFTYSGHATACAVAI^^ 

KRIiAELSDLPIVGDTOOTGIiMLGN^LVARGTREPLPGAAVAEALRERAGII^^ 

PDGRVL 

SEQ ID NO:22 cvm3 Polypeptide 

VTRPPGLSAHTHGSVSGSLLRRVAGHYPTGVVLVTGPAEAPGQPPPAMVVGTFTSVSLDP^ 

liRAAGRPCVNVLGMJQGPVCRSFAGGDPGRWBVPYRTTATGSPV^^ 

VIREGSPLVFLRGDYGHWAGGGGSGRAGRRSAVCPV 

SEQ ID NO:23 orf6par Polypeptide 

MRASSPRGFRVHHGHAGIRGSHADIAVIASDVPAAVGAVFTRSRFAAPSVLLSRDAVADGIARGVV^ 
GPRGYEDAAEVRHLVAGIVDCTERDVIilASTGPVGERYPMSRVRAHIiRAWGPLPGADFDGAA 
RRARCGDATLIGVAKGPGTGPAEQDDRSTIiAFFCTDAQVSPVVIiDDIFRRVADRAFHGLGF^ 
GIAGRVDLVAFEQVLGAIiAIiDLVRDVVRDSGCGQAIjVTTO 

VAAVAGGHGDEGPGRSPGRITIRVGGREVFPAPRDRARPDAWAYPHGGEVTVHIDLGVPGRAPGAPT^ 
GYPRLGAGRAV. 

SEQ ID NO:24 orf4par Polypeptide 

VSTAVSPRYAQPATFMRIjRHRPDPIGHDVVWGAPYIX;GTSYRPGARFAPRAIRHESSLIHGVGIDRGPGVF^ 
VVDGGDIDLSPFSMDIiAMDTATVALTRLLEimDAFLMLGGDHSLSIi^^ 

GTYHHGTPFRWAIEEGLVDPERIiVQVGIRGHNPRPDSLDYARGHGVSIVTAADFTRRSPRGIAEQIRRTVGGLPIiY 
VSVDIDVVDPAYAPGTGTPAPGGLSSREVLTLLDWGQLRPVGFDWEVSPAYDPSGITSLLT^ 
ATTSPASAPVDSPLPPGATUUDDAENT^ENAVDAVDAESAVDFAGQRWG . 

SEQ ID NO:25 orf3par Polypeptide

MPGPDLVYGFRWIGTEGRPGGGPGGHSEPGSAPRFAWGTHWVHDGTAYPLWSGTAVTLGRPPV^ 

liAGELYNRAELTGAIiGGSSAALGDAELLLAAWRRWGPGAFRLLNGRPAALLTD 

VTGLSAATEAKTIiAHEPGRPLGLSGTHTAPGAAGVCRVPAGTALIilJIGVGGSDITARAWTO^ 

vdlvgerlatavrtrlrggeaaptvvlsggidsggvaahtaalapgtrsvsmgtevsdefdaarsvavh 

IRIiHSAELVRELPWAVAAAEITDPTVLEYIJiPIiVAIiYRRIiDTGPLRILTGYGADIPIiG^ 

DMAGFDGIJMEMSPVIjAGIAGKWTTHPYWDRAVIJ)ALVSLEPGLKRRRGTD^^ 

GSGTTSAWTGLLLAEGIRRDEVTAVKGWVIARRLYDAWIDTVPPEDVDFG^ 

SEQ ID NO:26 orf2par Polypeptide

MATTTAKAMLERLHQYGVDHVFGWGREASMLFDEVEGLJ^FVLTRHEPTAGV^ 

TNIiATGVATSAIiDRSSVIALAAQSESYDCTPNVTHQCIiDSTAVMGPLTKFSVQL^ 

GPSFISLPVDLLGAEIiNGTPTDAPLVRATATHAIiDADWRARIiDEAAEIiVREAEN^ 

ERIiNIP\nn'TYTAKGVLPHDHPliNYGAISGYNnDGILGHPALDEIFGPADLLI^ 

RVAPEVNPIPELFRADIDlVTISnrAEFVTAIiDDATSGIiAPKTRHDLSAI^^ 

SVLDNGTFVSDIGFFRHYGVLFAKSDQPYGFLTSAGCSSFGYGLPAAMAAQIARPGEPVFIilAGDGGFHSNSADIE 

TAVRLGLPIVMVVVNNDRNGIilELYQNIiGHQRSHAPAVGF^ 

RPFLIEVPVAYDFQSGGFAAIiAI 

Claims

1. A S. clavuligerus microorganism comprising DNA corresponding to one or more open 
reading frames essential for 5S clavam biosynthesis, wherein said open reading frames are 
disrupted or deleted such that the production of 5S clavams by said S. clavuligerus is reduced and 
clavulanic acid production is at least maintained, wherein the open reading frames are selected 
from: 

a) cvm6para (SEQ ID NO:1); 

b) cvm7para (SEQ ID NO:2); 

c) cvm6para and cvm6 (SEQ ID NO:5); or 

d) cvm7para and cvm7 (SEQ ID NO:6). 

2. A S. clavuligerus microorganism comprising DNA corresponding to one or more open 
reading frames essential for 5S clavam biosynthesis, wherein said open reading frames are 
disrupted or deleted such that the production of 5S clavams by said S. clavuligerus is reduced and 
clavulanic acid production is at least maintained, wherein the open reading frames are selected 
from: 

a) cvm6para and one or more of cvm1 (SEQ ID NO:7), cvm2 (SEQ ID NO:8), cvm3 (SEQ ID 
NO:9), cvm4 (SEQ ID NO:10), cvm5 (SEQ ID NO:11), cvm6, cvm7 or cvm7para; or 

b) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para. 

3. An isolated polynucleotide comprising open reading frames selected from the group 
consisting of: 

a) cvm6para; 

b) cvm7para; 

c) cvm6para and cvm6; 

d) cvm7para and cvm7; 

e) cvm6para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm7para; or 

f) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para. 

4. An isolated polynucleotide comprising one or more open reading frames encoding one or 
more enzymes involved in clavulanic acid biosynthesis wherein said open reading frames are 
selected from the group consisting of: 

a) orf2para (SEQ ID NO:12), 

b) orf3para (SEQ ID NO:13), 

c) orf4para (SEQ ID NO:14), and 

d) orf6para (SEQ ID NO:15). 

5. An isolated polynucleotide comprising one or more open reading frames encoding one or 
more enzymes involved in clavulanic acid biosynthesis wherein said open reading frames 
comprise one or more of: 

a) orf2para, 

b) orf3para, 

c) orf4para, 

d) orf6para 

in combination with one or more genes involved in clavulanic acid biosynthesis selected from 
orf2, orf3, orf4, orf5, orf6, orf7, orf8, orf9, orf10, orf11, orf12, orf13, orf14, orf15, orf16, orf17, or 
orf18. 



6. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising the nucleotide sequence of SEQ ID NO:16; 

b) a polynucleotide having the nucleotide sequence of SEQ ID NO:16; 

c) a polynucleotide comprising the nucleotide sequence of SEQ ID NO:17; and 

d) a polynucleotide having the nucleotide sequence of SEQ ID NO:17. 



7. A vector comprising the polynucleotide of any one of claims 3 to 6. 

8. A S. clavuligerus microorganism comprising the vector of claim 7. 



9. A process for improving clavulanic acid production in a suitable microorganism 
comprising isolating the polynucleotide of any one of claims 3 to 6, manipulating said 
polynucleotide, introducing the manipulated polynucleotide into said suitable microorganism 
and fermenting said suitable microorganism under conditions whereby clavulanic acid is 
produced. 



10. A process according to claim 9 wherein the polynucleotide is a cvm or cvmpara 
polynucleotide and the manipulation comprises disrupting or deleting cvm or cvmpara gene 
sequences. 

11. A process according to claim 9 wherein the polynucleotide is an orf or orfpara 
polynucleotide and the manipulation thereof comprises insertion of the polynucleotide into vectors 
suitable for expression. 


12. A process according to any one of claims 9 to 11 wherein the suitable microorganism is S. 
clavuligerus. 

Abstract

New processes for improving the manufacture of clavams, e.g. clavulanic acid. Novel DNA 
sequences and new microorganisms capable of producing increased amounts of clavulanic 
acid are also disclosed. 

Fig. 2: Orientation of cvm7 relative to the published cvm cluster (figure not reproduced; legible gene labels: cvm7, cvm2).